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ABSTRACT 


Airspace  encounter  models,  covering  close  encounter  situations  that  may  oc¬ 
cur  after  standard  separation  assurance  has  been  lost,  are  a  critical  component  in 
the  safety  assessment  of  aviation  procedures  and  collision  avoidance  systems.  Of 
particular  relevance  to  Unmanned  Aircraft  Systems  (UAS)  is  the  potential  for  en¬ 
countering  general  aviation  aircraft  that  are  flying  under  Visual  Flight  Rules  (VFR) 
and  which  may  not  be  in  contact  with  air  traffic  control.  In  response  to  the  need 
to  develop  a  model  of  these  types  of  encounters,  Lincoln  Laboratory  undertook  an 
extensive  radar  data  collection  and  modeling  effort  involving  more  than  120  sensors 
across  the  U.S.  This  report  describes  the  structure  and  content,  of  that  encounter 
model.  The  model  is  based  on  the  use  of  Bayesian  networks  to  represent  relation¬ 
ships  between  dynamic  variables  and  to  construct  random  aircraft  trajectories  that 
are  statistically  similar  to  those  observed  in  the  radar  data.  The  result  is  a  frame¬ 
work  from  which  representative  intruder  trajectories  can  he  generated  and  used  in 
fast-time  Monte  Carlo  simulations  to  provide  accurate  estimates  of  collision  risk. 

The  model  described  ill  this  report  is  one  of  three  developed  by  Lincoln  Labo¬ 
ratory.  An  encounter  with  ail  intruder  that  does  not  have  a  transponder,  or  between 
two  airc  raft  using  a  Mode  A  code  of  1200  (VFR).  is  uncorrelated  in  the  sense  that  it 
is  unlikely  that  there  would  be  prior  intervention  bv  air  traffic  control.  The  uncorre¬ 
lated  model  described  in  this  report  is  based  on  transponder-equipped  aircraft  using 
a  1200  (VFR)  Mode  A  code  observed  by  radars  across  the  U.S.  As  determined  from 
a  comparison  against  primary-only  tracks,  in  addition  to  representing  VFR-on-VFR 
encounters,  this  model  is  representative  of  encounters  between  a  cooperative  aircraft 
and  conventional  nonoooperative  aircraft  similar  to  those  that  use  a  1200  transpon¬ 
der  code.  A  second  uncorreiated  model  is  also  being  developed  for  unconventional 
aircraft  that  have  different  flight  characteristics  than  1200-code  aircraft.  Finally,  a 
correlated  encounter  model  has  been  developed  to  represent  situations  in  which  it  is 
likely  that  there  would  be  air  traffic  control  intervention  prior  to  a  close  encounter. 
The  correlated  model  applies  to  intruders  that  are  using  a  discrete  (non- 1200)  code. 

Separate  electronic  files  are  available  from  Lincoln  Laboratory  that  contain 
the  statistical  data  required  to  generate  and  validate  encounter  trajectories.  Details 
on  how  to  interpret  the  data  file  and  an  example  of  how  to  randomly  construct 
trajectories  are  provided  in  Appendices  A  and  B.  respectively.  A  Matlah  software 
package  is  also  available  to  generate  random  encounter  trajectories  based  on  the 
data  tables. 

A  byproduct  of  the  encounter  modeling  effort  was  the  development  of  National 
aircraft  track  and  traffic  density  databases.  Example  plots  of  traffic  density  data 
are  provided  in  this  report,  but  the  complete  track  and  density  databases  are  not 
provided  in  electronic  form  due  to  their  size  arid  the  complexity  of  processing  spec  ific 
locations,  altitudes,  and  times. 
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1.  INTRODUCTION 


One  of  the  main  challenges  to  integrating  unmanned  aircraft  into  the  National  Airspace  Sys¬ 
tem  (NAS)  is  the  development  of  systems  that  are  able  to  sense  and  avoid  local  air  traffic.  If 
designed  properly,  these  collision  avoidance  systems  could  provide  an  additional  layer  of  protec¬ 
tion  that  maintains  or  even  enhances  the  current  exceptional  level  of  aviation  safety.  However, 
due  to  tlieii  safety-critical  nature,  rigorous  assessment  is  required  before  sufficient  confidence  can 
exist  to  certify  collision  avoidance  systems  for  operational  use.  Evaluations  typically  include  flight 
tests,  operational  impact  studies,  and  simulation  of  millions  of  traffic  encounters  with  the  goal  of 
exploring  the  robustness  of  the  collision  avoidance  system.  Key  to  these  simulations  are  so-called 
encounter  models  that  describe  the  statistical  makeup  of  the  encounters  in  a  way  that  represents 
what  actually  occurs  in  the  airspace.  Each  encounter  generated  from  the  model  specifies  the  initial 
positions  and  orientations  of  two  airc  raft  and  the  nominal  dynamic  maneuvers  (not  affected  by  a 
collision  avoidance  system)  that  may  occur  leading  up  to  the  closest  point  of  approach.  VVliat  is 
produced,  then,  is  the  geometric,  dynamic  situation  a  collision  avoidance  system  may  be  expected 
to  resolve  safely.  Identical  encounter  situations  are  typically  simulated  with  and  without  a  collision 
avoidance  system  so  that  relative  system  benelits  may  be  quantified.  Knowledge  of  the  rate's  at 
which  encounter  situations  occur  in  the  NAS  can  also  be  used  to  estimate  the  rate  of  near  mid-air 
collisions  per  flight  hour  and  thus  arrive'  at  an  overall  risk  metric. 

In  these  models,  encounter  situations  are  abstracted  in  the  sense  that  there  is  no  consideration 
of  an  explic  it  location  or  local  airspace  structure  (e.g..  airways,  metering  fixes,  approach  paths). 
Rather,  the  encounters  represent  what  may  be  statistically  expected  to  occur  over  the  lifetime  of  a 
given  system.  If  desired,  a  particular  altitude  layer  or  airspace  class  (e.g..  Class  E)  can  be  specified 
and  aircraft  behavior  will  be  representative  of  that  particular  region.  Additionally,  the  flight  path 
of  one  airc  raft  can  be  constrained  to  focus,  for  instance,  on  a  particular  departure  profile  or  flight 
condition.  It  should  also  be  noted  that  the  models  cover  approximately  1  minute  before  closest 
point  of  approach  and  so  are  not  appropriate  for  large-scale  air  traffic  impact  studies  (e.g..  to 
examine  sector  loading  or  conflict  rates):  the  focus  here  is  on  events  in  which  loss  of  separation  has 
already  occurred  between  two  aircraft  and  collision  avoidance  becomes  paramount. 

One  system  that  has  been  rigorously  tested  in  this  manner  is  the  Traffic  alert  and  Collision 
Avoidance  System  (TCAS).  As  part  of  the  TCAS  certification  process  in  the  1980s  and  1990s, 
several  organizations  tested  the  system  across  millions  of  simulated  close  encounters  and  evaluated 
the  risk  of  a  near  mid-air  collision  (NMAC.  defined  as  separation  less  than  500  ft  horizontally 
and  100 ft  vertically)  [1  4].  This  analysis  ultimately  led  to  the  certification  and  U.S.  mandate 
for  TCAS  equipage  on  larger  transport  aircraft.  More  recently.  Eurocontrol  and  the  International 
Civil  Aviation  Organization  (ICAO)  performed  similar  sets  of  simulation  studies  for  European  and 
worldwide  TCAS  mandates  [5,6]. 

The  design  of  a  collision  avoidance  system  represents  a  careful  balance  to  enhance  safety 
while  ensuring  a  low  rate  of  unnecessary  maneuvers.  This  balance  is  strongly  affected  by  the  types 
of  encounter  situations  to  which  the  system  is  exposed.  It  is  therefore  important  that  simulated 
encounters  are  representative  of  those  that  occur  in  the  airspace.  Hence,  tremendous  effort  has 
been  made  by  various  institutions  since  the  early  1980s  to  develop  encounter  models  based  on 


1 


radar  data  [1,3,7  10].  The  primary  contribution  of  this  report  is  to  introduce  a  new  approach 
to  encounter  modeling  that  is  based  on  a  Bayesian  statistical  framework  and  which  uses  recent 
radar  data  collected  across  the  U.S.  The  advantage  of  applying  a  Bayesian  framework  is  that  it 
allows  us  to  optimally  leverage  available  radar  data  to  produce  a  model  that  is  representative  of 
actual  operations.  The  result  is  a  framework  from  which  representative  intruder  trajectories  can  be 
generated  and  used  in  fast-time  Monte  Carlo  simulations  to  provide  accurate  estimates  of  collision 
risk. 


A  byproduct  of  the  encounter  modeling  effort  was  the  development  of  aircraft  track  and 
traffic  density  databases  of  the  National  Airspace  System.  Example  plots  of  traffic  density  data  are 
provided  in  this  report,  but  the  complete  track  and  density  databases  are  not  provided  in  electronic 
form  due  to  their  size  and  the  complexity  of  processing  specific  locations,  altitudes,  and  times. 


1.1  ENCOUNTER  TYPES 

The  encounters  covered  by  this  model  are  those  involving  aircraft  in  the  final  stages  before  a 
collision.  It  is  assumed  that  prior  safety  layers  (e.g.,  airspace  structure,  Air  Traffic  Control  (ATC) 
advisories  or  vectors)  have  failed  to  maintain  standard  separation  distances  between  aircraft.  The 
model  is  therefore  useful  in  describing  the  types  of  situations  that  need  to  be  addressed  by  a  collision 
avoidance  system,  but  will  not  address  other  separation  aspects  such  as  ATC  communications  or 
coordination. 

Because  they  are  by  far  the  most  likely  to  occur,  only  pairwise  (two-aircraft)  encounters 
are  explicitly  discussed  in  this  model.  If  required,  the  probability  of  multiple  aircraft  encounters 
can  be  determined  from  the  traffic  density  database.  Random  multi-aircraft  encounters  can  be 
constructed,  if  desired,  by  combining  several  trajectories  generated  from  the  model  presented  here. 

There  are  two  fundamental  types  of  encounters.  In  the  first,  both  aircraft  involved  are  coop¬ 
erative  (i.e.,  have  a  transponder)  and  at  least  one  is  in  contact  with  air  traffic  control.  It  is  then 
likely  that  at  least  one  aircraft  will  receive  some  notification  about  the  traffic  conflict  and  begin 
to  take  action  before  a  collision  avoidance  system  gets  involved.  We  term  this  type  of  encounter 
“correlated”  because  the  trajectories  of  each  aircraft  may  involve  maneuvers  that  are  correlated  to 
some  degree  due  to  this  prior  intervention.  The  second  type  of  encounter  we  term  “uncorrelated” 
and  involves  at  least  one  noncooperative  aircraft  (i.e.,  not  using  a  transponder)  or  two  aircraft 
flying  under  Visual  Flight  Rules  (VFR)  without  flight  following  (i.e.,  using  a  transponder  Mode 
A  code  of  1200).  In  these  encounters,  it  is  unlikely  that  air  traffic  control  would  become  involved 
prior  to  the  close  encounter;  rather  the  two  aircraft  must  rely  solely  on  visual  acquisition  (or  some 
other  collision  avoidance  system)  at  close  range  to  remain  separated.  Such  encounters  tend  to  be 
uncorrelated  since  there  is  no  coordinated  intervention  prior  to  the  close  encounter:  the  assumption 
is  that  the  two  aircraft  blunder  into  close  proximity.  A  complete  evaluation  of  unmanned  systems 
will  require  analysis  using  both  correlated  and  uncorrelated  models. 

This  report  focuses  on  uncorrelated  encounter  modeling.  In  this  report,  we  base  our  uncorre- 
latcd  model  on  beacon  reports  from  aircraft  squawking  1200,  not  radar  returns  from  noncooperative 
traffic.  Radar  surveillance  of  noncooperative  targets  is  complicated  due  to  clutter  and  missed  detec- 
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tions.  making  identification  of  real  tracks  difficult.  Another  modeling  effort  is  underway  specifically 
focused  on  truly  noncooperative  aircraft  tracks.  The  lack  of  a  transponder  means  that  only  position 
information,  and  no  identity  code  or  altitude,  is  available.1  Hence,  it  is  difficult  to  infer  vertical 
rates,  an  important  component  of  an  encounter  model. 

Beacon-equipped  aircraft  can  transmit  either  a  discrete  Mode  A  code  or  the  11011-discrete 
code  of  1200.  Aircraft  using  code  1200  tend  to  be  general  aviation  flying  VFR.  They  generally  fly 
at  low  altitudes  and  make  significantly  more  maneuvers  than  transport  aircraft,  both  horizontally 
and  vertically.  Thus  to  a  large  degree  they  resemble  noncooperative  aircraft.  Due  to  the  poor 
quality  of  noncooperative  data,  we  use  1200  tracks  here  as  surrogates  for  primary-only  tracks  for 
the  construction  of  this  model.  Other  work  underway  toward  processing  true  noncooperative  tracks 
will  be  described  in  a  future  document. 

Additional  analysis  (Appendix  D)  shows  that  1200-code  tracks  are  a  proper  surrogate  for  cer¬ 
tain  classes  of  noncooperative  traffic  in  the  National  Airspace  System,  but  there  are  other  categories 
of  noncooperative  targets  for  which  they  are  not  suitable.  For  example,  most  balloons,  ultralights, 
and  gliders  do  not  fly  like  transponder-equipped  aircraft  squawking  1200.  The  challenge  in  devel¬ 
oping  models  for  unconventional  aircraft  such  as  balloons,  ultralights,  and  gliders  is  the  lack  of 
high-quality  data.  As  these  data  become  available,  the  same  processes  outlined  in  this  report  will 
be  used  to  estimate  a  noncooperative  model  for  unconventional  aircraft.  Additionally,  where  such 
data  are  not  available,  a  model  can  be  manually  constructed  based  011  knowledge  of  typical  flight 
trajectories  and  performance  limits. 

Table  1  shows  which  encounter  model  to  use  depending  on  the  types  of  aircraft  involved  in 
the  encounter.  There  are  three  types  of  encounter  models  shown  in  this  Table:  correlated  (C), 
uncorrelated  conventional  (U),  and  uncorrelated  unconventional  (X).  In  terms  of  aircraft  types, 
Discrete  indicates  a  cooperative  aircraft  using  a  non- 1200  Mode  A  transponder  code,  and  VFR 
denotes  a  cooperative  aircraft  using  the  1200  Mode  A  transponder  code.  Only  the  uncorrelated 
conventional  model  (bold  U)  is  discussed  in  this  report:  the  other  models  are  described  in  other 
reports. 


1.2  RADAR  DATA 

The  radar  data  used  to  build  the  model  comes  from  the  84th  Radar  Evaluation  Squadron  (RADES) 
at  Hill  AFB.  Utah.  RADES  receives  radar  data  from  FAA  and  Department,  of  Defense  sites  through¬ 
out  the  United  States.  They  maintain  continuous  real-time  feeds  from  a  network  of  sensors,  includ¬ 
ing  long-range  ARSR-4  radars  around  the  perimeter  of  the  United  States  and  short-range  ASR-8, 
ASR-9.  and  ASR-11  radars  in  the  interior.  Radar  ranges  vary  from  60  to  250  NM.  Figure  1  shows 
the  approximate  coverage  by  the  more  than  120  sensors  that  were  used  to  construct  the  model. 
Figure  2  shows  the  average  density  of  VFR  traffic  between  December  2007  and  August  2008,  and 
Appendix  H  plots  flight  hour  densities  at  different  altitude  layers.  Appendix  G  summarizes  the 
volume  of  data  received  from  each  sensor  for  the  construction  of  the  model. 

1  Some  military  radars  have  height-finding  capability,  although  the  accuracy  of  the  altitude  generated  is  significantly 
inferior  to  the  transponder  Mode  C  altitude. 


TABLE  1 


Encounter  model  types. 


C  =  correlated,  U  =  uncorrelated  conventional,  X  =  uncorrelated  unconventional 


Intruder  Aircraft 

Aircraft  of  Interest 

Discrete 

VFR 

Discrete 

c 

C 

VFR 

c 

u 

Noncooperative  conventional  (fixed-wing  powered  aircraft) 

u 

u 

Noncooperative  unconventional  (balloon,  glider) 

X 

X 

To  provide  the  necessary  information  to  model  aircraft  trajectories,  two  weeks  of  1200-code 
aircraft  reports  were  collected  from  more  than  120  sensors,  covering  the  time  periods  1  7  December 
2007  and  1  7  June  2008.  As  discussed  in  Appendix  C  it  was  determined  that  these  two  weeks  of 
data  provided  sufficient  statistical  power  and  generality  to  enable  a  valid  and  representative  model 
of  1200-code  flight  characteristics.  In  addition  to  this  focused  two-week  data  set,  all  aircraft  tracks 
from  the  complete  nine-month  set  of  radar  data  were  archived  and  are  available  for  traffic  density 
studies  as  needed. 

Note  that  there  are  a  number  of  advantages  to  the  RADES  data  feed  compared  to  the  En¬ 
hanced  Traffic  Management  System  (ETMS)  data  often  used  in  airspace  analyses.  ETMS  data 
include  only  cooperative  aircraft  on  filed  IFR  flight  plans  and  provides  updates  once  per  minute 
showing  aircraft  position  after  processing  by  air  traffic  control  automation.  In  contrast,  the  RADES 
data  feed  streams  continuously  directly  from  the  radar,  including  primary-only  radar  returns  as 
well  as  cooperative  transponder  returns  (whether  on  a  flight  plan  or  not),  providing  track  updates 
every  5  or  12  seconds  without  being  affected  by  automation  systems.  This  ensures  that  our  filters 
and  trackers  have  the  best  raw  data  with  which  to  begin  processing. 

National  Offload  Program  (NOP)  data  is  another  source  that  could  have  been  used  to  con¬ 
struct  this  model.  An  advantage  of  NOP  data  is  the  inclusion  of  flight-plan  and  aircraft-type 
information.  However,  NOP  data  is  post-automation,  like  ETMS,  does  not  include  data  from 
Department  of  Defense  sensors,  and  does  not  have  as  comprehensive  coverage  as  the  RADES  feed. 


1.3  PROCESS  OVERVIEW 

Figure  3  diagrams  the  steps  involved  in  processing  radar  data  to  build  the  encounter  model  and 
generate  encounters  for  simulation.  The  high-level  approach  is  to  model  nominal  aircraft  trajectories 
using  Markov  models  represented  by  Bayesian  networks  (Section  2).  The  first  step  in  constructing  a 
Markov  model  involves  extracting  features,  such  as  turn  rate  and  vertical  rate,  from  the  radar  data. 
From  these  features,  sufficient  statistics  are  collected  to  describe  the  distribution  over  maneuvers 
and  other  properties  of  trajectories.  Bayesian  model  selection  methods  are  used  to  search  for  the 
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Figure  1.  Radar  coverage  map. 


Figure  2.  Cumulative  VFR  (1200- code)  flight  hours  per  1/6  x  1/6  degree  cell  for  the  period  1  December 
2007  to  SI  August  2008.  The  locations  of  the  sensors  are  indicated  by  magenta  circles.  White  indicates  no 
1200-code  reports  were  found  in  a  cell  due  to  lack  of  traffic ,  radar  coverage ,  or  terrain  masking. 
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best  network  structure  that  represents  the  observed  data  (Section  3).  The  best  network  structure 
is  then  selected  and  the  associated  sufficient  statistics  are  obtained  to  generate  new  trajectories 
that  are  representative  of  those  observed  by  radar  (Section  4).  Finally,  the  trajectories  are  used  in 
a  dynamic  simulation  to  evaluate  encounters  between  aircraft  with  or  without  a  collision  avoidance 
system  (Section  5).  Section  6  discusses  using  the  encounter  model  for  large-scale  safety  analysis 
and  collision  risk  estimation. 

A  series  of  Appendices  is  also  included  to  provide  additional  detail  and  background  data. 
Briefly:  Appendix  A  describes  the  parameters  and  data  formats  used  to  represent  encounters  in  the 
model;  Appendix  B  describes  how  to  use  the  data  file  and  model  parameters  to  generate  encounter 
trajectories;  Appendix  C  describes  a  comparison  of  traffic  characteristics  at  varying  locations  and 
times  during  the  year;  Appendix  D  discusses  the  validity  of  the  model  toward  representing  truly 
noncooperative  aircraft;  Appendix  E  describes  the  process  to  be  followed  to  validate  encounter 
trajectories  generated  from  the  encounter  model;  Appendix  F  provides  more  detail  on  the  radar 
data  processing  and  track  fusion  process;  Appendix  G  lists  the  radars  included  in  the  RADES  data 
feed;  Appendix  H  shows  traffic  density  maps  for  the  U.S.  at  varying  altitude  levels;  Appendix  I 
provides  a  brief  review  of  Bayesian  networks;  and  finally,  Appendix  J  provides  an  archive  of  the 
different  Bayesian  networks  that  were  tested  as  candidates  for  the  final  model. 

A  digital  representation  of  the  sufficient  statistics  and  model  structure  described  in  this  report 
are  available  on  request  from  Lincoln  Laboratory.  An  example  of  using  the  data  in  that  file  to 
construct  a  random  trajectory  is  provided  in  Appendix  B.  Additionally,  a  Matlab  software  package 
is  available  to  generate  random  trajectories  using  the  data  tables. 

Throughout  this  report  we  use  the  standard  units  used  in  aviation.  In  particular,  altitudes 
are  in  feet,  positions  are  in  nautical  mile  coordinates,  vertical  rates  are  in  feet  per  minute,  turn 
rates  are  in  degrees  per  second,  airspeeds  are  in  knots  (true  airspeed),  and  accelerations  are  in 
knots  per  second.  Time  is  reported  in  seconds. 
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Figuiv  J.  Modeling  and  simulation  process  overview.  The  sufficient  statistics  and  network  stnicturc  (in  bold) 
arc  the  main  elements  provided  as  part  of  this  work. 
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2.  MODEL 


We  model  nominal  flight,  i.e.,  flight  without  avoidance  maneuvering,  using  a  Markov  process 
represented  by  a  dynamic  Bayesian  network.  A  Markov  process  is  a  stochastic  process  where  the 
probability  distribution  over  future  states  is  conditionally  independent  of  past  states  given  the 
present  state.  In  other  words,  one  only  needs  to  know  the  present  state  to  predict  the  next  state. 

The  states  in  the  model  specify  how  the  position,  altitude,  and  airspeed  of  each  aircraft 
involved  in  an  encounter  change  over  time.  In  particular,  each  state  specifies  a  vertical  rate  /*. 
turn  rate  tp,  and  airspeed  acceleration  v.  Given  an  initial  airspeed  ik  position  (j*.  y.  /?)>  heading  t/\ 
vertical  rate  /?,  altitude  layer  L,  and  airspace  class  A ,  one  can  determine  from  the  model  how  the 
aircraft  trajectories  evolve  over  the  course  of  the  encounter. 

One  way  to  represent  a  Markov  model  is  with  an  exhaustive  state-transition  matrix  that 
specifies  the  probability  of  transitioning  between  all  pairs  of  states.  Unfortunately,  the  number 
of  independent  parameters  (that  is.  independent  probabilities)  required  to  define  the  matrix  grows 
super-exponent  iallv  with  the  number  of  variables  defining  the  model.  The  more  independent  param¬ 
eters  there  are  in  the  model,  the  more  data  one  needs  to  properly  estimate  their  values.  However, 
by  using  dynamic  Bayesian  networks,  we  can  leverage  conditional  independence  between  some  vari¬ 
able's  to  greatly  reduce  the  number  of  parameters.  We  can  learn  the  structure  of  the*  dynamic 
Bayesian  network  by  maximizing  the  posterior  probability  of  the  network  structure  given  the  data. 

Appendix  I  provides  the  necessary  background  on  Bayesian  networks.  The  remainder  of  this 
section  defines  an  encounter  in  further  detail,  introduces  the  variables  used  to  describe  an  encounter, 
and  presents  the  modeling  structure. 


2.1  MODEL  VARIABLES 

There  are  six  variables  in  the  uncorrelated  encounter  model: 

•  Airspace  class  A:  The  airspace  class  variable  was  incorporated  into  the  model  to  account 
for  the  variation  in  how  aircraft  fly  in  different  airspace  (‘lasses.  This  variable  may  take  on  one 
of  four  values:  B,  C,  D,  and  O,  indicating  which  class  of  airspace  the  aircraft  is  in.  The  values 
B,  C,  and  D  correspond  to  the  controlled  airspace  classes  defined  by  the  FA  A.  The  value  O 
represents  “other  airspace,’*  including  airspace  Classes  E  and  G.  Note  that  there  should  be 
no  VFR  aircraft  in  Class  A  due  to  the  requirement  that  aircraft  in  that  Class  of  airspace  fly 
under  instrument  Flight  Rules. 

•  Altitude  layer  L\  Airspace  is  divided  into  four  altitude  layers,  in  a  process  similar  to 
prior  encounter  models  developed  by  Eurocontrol.  The  first  layer  spans  from  500  to  1200  ft 
Above  Ground  Level  (AGL)  to  capture  aircraft  in  the  traffic  pattern  or  performing  low- 
level  maneuvers.  The  second  layer  spans  a  transition  zone  from  1200  to  3000  ft  AGL.  the 
cruise  altitude  where  the  hemispheric  rule  begins.  The  third  layer  spans  from  3000 ft  AGL 
to  5000  ft  AGL  covering  a  mix  of  low-altitude  enroute  and  maneuvering  aircraft.  The  fourth 
layer  includes  airspace  above  5000  ft  AGL  and  would  cover  most  enroute  VFR  traffic. 


9 


•  Airspeed  v:  We  model  true  airspeed  and  allow  it  to  vary  during  flight. 

•  Acceleration  v:  We  allow  airspeed  acceleration  to  vary  every  second,  providing  higher- 
fidelity  motion  than  prior  models. 

•  Turn  rate  ip:  Turn  rate  is  permitted  to  change  every  second  in  our  model.  The  prior 
European  and  ICAO  cooperative  models  allowed  only  a  single  turn  during  an  encounter. 

•  Vertical  rate  /?:  The  vertical  rate  is  permitted  to  change  at  every  second.  All  prior  cooper¬ 
ative  models  allowed  only  a  single  vertical  acceleration  period  during  an  encounter. 

Because  many  of  the  variables  are  closely  related  due  to  physical  constraints  and  flight  char¬ 
acteristics  (e.g.,  turn  rate  and  vertical  rate)  is  it  important  to  properly  represent  correlations  in  the 
model.  Independently  sampling  from  distributions  for  turn  rate  and  vertical  rate  would  miss  these 
important  relationships.  The  remainder  of  this  section  explains  how  to  model  joint  probability 
distributions  over  these  variables  to  ensure  proper  consideration  of  correlations.  To  generate  an 
encounter,  we  first  randomly  sample  from  the  joint  distribution  over  the  encounter  variables  to 
define  the  encounter  geometry  and  initial  conditions.  Second,  we  use  a  Markov  model  to  determine 
how  the  dynamic  variables,  i.e.,  turn  rate,  vertical  rate,  and  airspeed  acceleration,  evolve  during 
the  course  of  the  encounter.  There  are  two  corresponding  separate  probability  distributions  in  the 
model:  an  initial  distribution  to  set  up  an  encounter  situation,  and  a  transition  distribution  to 
describe  how  the  dynamic  variables  specifying  the  trajectories  of  the  aircraft  evolve  over  time. 


2.2  INITIAL  DISTRIBUTION 

The  aircraft  encounter  model  represents  the  distribution  over  the  initial  values  of  /?,  ip,  v,  v,  L,  and  A 
as  well  as  the  time-varying  history  of  /?,  ip,  and  v  during  the  course  of  an  encounter.  Probabilistic 
relationships  between  these  variables  are  represented  using  a  Bayesian  network.  An  example  of 
such  a  relationship  is  the  one  between  turn  rate  and  vertical  rate.  Without  properly  capturing 
this  dependency  and  other  important  dependencies,  unrealistic  situations  may  be  generated,  e.g., 
involving  aircraft  with  simultaneously  high  climb  rates  and  high  turn  rates.  Initial  position  and 
altitude  is  determined  through  a  separate  process  described  in  Section  5. 

A  Bayesian  network  (e.g.,  Figure  4a)  includes  a  series  of  nodes  represented  by  rectangles  and 
directed  arcs  represented  by  arrows.  Each  node  corresponds  to  a  particular  variable  that  may  take 
on  one  of  several  discrete  values  with  associated  probabilities.  Certain  variables,  such  as  airspace 
class  or  altitude  layer,  are  naturally  quantized  into  a  few  discrete  values  (e.g.,  B,  C,  D,  and  O 
for  airspace  class).  Continuous  variables,  such  as  vertical  rate  or  turn  rate,  are  described  by  a 
series  of  discrete  bins  from  which  a  continuous  value  is  later  selected.  Within  each  node,  then,  is 
a  description  of  the  possible  values  a  variable  can  take  and  the  probability  that  each  value  will 
occur.  The  directed  arcs  describe  how  the  probabilities  of  one  variable  depend  on  other  variables. 
Arrows  leading  into  a  particular  node  denote  which  parent  nodes  must  first  be  evaluated  in  order 
to  select  the  value  of  the  given  node.  For  example,  referring  to  Figure  4a,  the  probabilities  for  node 
L  depend  on  the  value  of  node  A ;  the  probabilities  for  node  v  depend  on  the  values  of  nodes  A  and 


10 


L.  In  the  latter  ease,  for  instance,  there  would  be  a  conditional  probability  table  describing  the 
probability  of  each  possible  value  of  v  jointly  conditioned  on  the  values  of  A  and  L:  P(v  \  A ,  L). 

Because  there  are  many  possible  ways  variables  can  be  connected  in  the  model,  it  is  necessary 
to  use  a  quantifiable  scoring  process  to  evaluate  the  quality  of  each  candidate  network.  For  this 
model,  we  used  a  Bayesian  scoring  process  (Appendix  I)  to  evaluate  different  Bayesian  network 
structures  and  choose  a  structure  that  optimally  represents  the  trajectories  we  observed.  Figure  4a 
shows  the  optimal  structure  for  the  initial  distribution.  Appendix  .1  shows  other  candidate  network 
structures  and  their  scores. 

Given  a  structure,  sufficient  statistics  extracted  from  data,  and  a  Bayesian  prior,  we  then 
sample  from  the  Bayesian  network  to  produce  initial  airspace  classes,  altitude  layers,  vertical  rates, 
turn  rates,  airspeeds,  and  accelerations  that  are  representative  of  those  found  in  the  data.  The  nodes 
and  directed  arcs  used  in  the  structural  diagram  show  the  order  in  which  this  sampling  occurs.  For 
example,  based  on  the  structure  in  Figure  4a.  to  determine  the  initial  state  of  the  aircraft  we  first 
randomly  determine  an  airspace  class  A.  Once  the  airspace  class  has  been  determined,  an  altitude 
layer  L  is  selected.  The  probabilities  associated  with  each  altitude  layer  depend  on  which  airspace 
class  was  chosen  earlier.  Once  A  and  L  have  been  selected,  we  then  randomly  select  airspeed  e, 
and  so  on.  An  example  of  working  through  this  process  is  provided  in  Appendix  B.  Alternately,  we 
could  assign  outright  an  airspace  class  and  altitude  layer  for  a  particular  study  and  then  randomly 
select  values  for  the  remaining  variables. 


2.3  TRANSITION  DISTRIBUTION 

We  use  a  separate  Bayesian  network  to  model  how  the  variables  h,  0,  and  v  evolve  over  time.  In  t  his 
network,  we  have  a  first  layer  that  represents  the  state  of  the  system  at  the  present  time  step  and  a 
second  layer  that  represents  the  state  of  the  system  at  the  next  time  step.  There  are  dependencies 
between  layers  and  within  the  second  layer.  Such  a  two-layer  temporal  Bayesian  network  is  known 
as  a  dynamic  Bayesian  network  [11.  12],  Parameter  and  structure  learning  in  dynamic  Bayesian 
networks  is  similar  to  regular  Bayesian  networks  (Appendix  I). 

Figure  4b  shows  the  structure  used  for  the  transition  distribution.  Again,  we  chose  the 
highest-scoring  network  structure  among  our  candidate  network  structures  (see  Appendix  .1).  Given 
a  structure,  sufficient  statistics  extracted  from  data,  and  a  prior,  we  can  then  sample  from  the 
Bayesian  network  to  determine  the  next  vertical  rate,  turn  rate,  and  airspeed  acceleration  command. 

In  general,  time  steps  in  dynamic  Bayesian  networks  may  be  of  any  duration,  but  for  our 
modeling  effort  we  chose  steps  of  Is.  Shorter  time  stej>s  allow  for  more  frequent  variations  in 
airspeed,  vertical  rate,  and  turn  rate,  but  they  require  more  computation  per  unit  of  simulation 
time.  Time  stops  of  1  s  balance,  maneuver  complexity  with  computation  and  are  also  appropriate 
given  typical  timescales  of  dynamic  maneuvers. 

A  complete  trajectory  is  constructed  by  updating  the  aircraft  state  in  1  s  intervals.  Within 
each  interval,  the  three  derivative  variables  h.  and  v  are  treated  as  target  values  and  held 
constant.  A  dynamic  model  is  used  to  compute  and  update  the  aircraft  state  at  each  time  step  based 
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on  these  piece  wise-const  ant  target  values.  The  dynamic  model  is  independent  of  this  encounter 
model  and  not  discussed  in  this  report,  but  a  process  for  validating  an  encounter  model  against  the 
one  used  at  Lincoln  Laboratory  is  described  in  Appendix  E. 
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(a)  Initial  distribution. 


(b)  Transition  distribution 


Figure  4 •  Bayesian  networks  representing  the  variable  dependency  structure  for  the  initial  and  transition 
distributions. 
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3.  ESTIMATION 


At  a  high  level,  the  modeling  process  involves  taking  in  a  large  volume  of  radar  data  (spanning 
two  weeks  from  more  than  120  sensors)  and  carefully  filtering  and  processing  that  data  to  extract 
features  of  aircraft  trajectories.  Features  include  static  variables  that  specify  an  encounter  (such 
as  altitude  layer  or  initial  airspeed)  and  multiple,  dynamic  variables  that  describe  aircraft  motion 
leading  up  to  and  through  the  closest  point  of  approach  (such  as  turn  rate,  vertical  rate,  and 
airspeed  acceleration  every  second).  To  aid  in  data  processing,  each  feature  was  quantized  into 
several  bins  and  counts  were  taken  of  the  frequency  with  which  each  bin  was  occupied  by  radar 
data.  Based  on  these  counts  (termed  sufficient  statistics),  probability  tables  were  then  constructed 
so  that  each  feature  can  be  randomly  generated  such  that  the  overall  geometries  and  dynamics  are 
representative  of  the  actual  events  observed  in  the  radar  data. 

Accordingly,  the  inputs  to  this  process  are  raw  radar  reports  (range,  azimuth,  altitude,  time) 
and  the  outputs  are  probability  tables  that  specify  the  likelihood  that  a  given  feature  will  take 
on  a  value  within  a  bin  corresponding  to  each  table  cell.  This  section  describes  the  processing 
required  to  transform  radar  tracks  into  sufficient  statistic  s  that  may  be  used  to  model  uncorrelated 
encounters.  Figure  5  outlines  the  multiple-stage  feature  extraction  process. 

To  build  the  model,  we  collected  VFR  (1200-code)  beacon  reports  between  December  1 
7.  2007.  and  between  June  1  7.  2008  amounting  to  74,000  flight  hours  after  fusing  tracks  seen 
by  multiple  radars.  Appendix  0  describes  tests  to  ensure  that  the  two  weeks  used  to  build  the 
uncorrelated  model  were  more  than  sufficient  to  build  a  statistically- valid  model  for  uncorrelated 
encounters.  In  addition,  nine  full  months  of  radar  data  were  archived  and  used  to  build  distributions 
of  VFR  traffic  density  across  the  United  States.  A  separate  correlated  model  also  developed  by 
Lincoln  Laboratory  required  processing  a  full  nine  months  of  raciar  data  because  encounters  between 
two  aircraft  are  relatively  rare  and  because  the  correlated  model  has  more  dependencies  between 
variables. 

The  raw  radar  data  are  first  processed  using  a  tracking  algorithm  developed  at  Lincoln  Lab¬ 
oratory  [13].  A  fusion  algorithm,  also  developed  at  Lincoln  Laboratory  [14],  then  fuses  tracks  from 
multiple  sensors  to  give  one  global  view  of  all  the  tracks  in  U.S.  airspace  (see  Appendix  F).  We  elim¬ 
inated  tracks  that  had  fewer  than  ten  scans.  We  found  that  approximately  ten  scans  are  required  to 
accurately  estimate  the  various  maneuver  rates.  We  also  eliminated  tracks  if  any  of  their  associated 
reports  were  inside  Special  Use  Airspace  whose  boundaries  are  defined  in  the  Digital  Aeronautical 
Flight  Information  File  (DAFIF),  8th  Edition,  managed  by  the  National  Geospatial-Intelligence 
Agency  (NGA). 


3.1  OUTLIER  REMOVAL 

The'  first  step  in  processing  the  raw  radar  tracks  is  to  detect  and  remove  outliers  In  the  horizontal 
plane,  we  remove  jumps  with  ground  speeds  above  600 kt  using  the  following  algorithm.  We  begin 
by  estimating  the  speed  between  each  sample  point  by  dividing  the  distance  between  samples  by 
the  time  interval  between  samples.  Samples  on  either  side  of  segments  where  the  speed  is  above 
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Figure  5.  Estimation  process  flow. 
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the  threshold  of  600 kt  are  stored  in  a  list  of  candidates  for  removal.  We  iterate  through  the  list 
of  candidates  and  remove  the  one  that  minimizes  the  sum  of  speeds  above  the  set  threshold.  The 
process  repeats  until  there  are  no  longer  any  segments  with  speeds  above  the  threshold 

In  the  vertical  plane,  we  remove  missing  Mode  C  altitude  reports  from  consideration.  Then, 
using  the  same  process  as  used  for  the  horizontal  plane,  we  remove  outliers  with  vertical  rates 
greater  than  5000 ft /min  or  less  than  — 5000ft/min.  We  remove  altitude  reports  that  come  before 
the  first  position  report  or  after  the  last  position  report  to  prevent  extrapolation.  We  also  remove 
position  reports  that  come  before  the  first  altitude  report  or  after  the  last  altitude  report,  also  to 
prevent  extrapolation.  After  outlier  removal,  we  discard  tracks  with  fewer  than  ten  valid  scans. 


3.2  TRACK  SMOOTHING 


After  removing  any  outliers  from  a  track,  we  smooth  the  remaining  data  points,  first  horizontally 
and  then  vertically.  We  use  the  same  smoothing  scheme  for  both  horizontal  and  vertical  smooth¬ 
ing.  We  use  the  following  general  formula  to  transform  a  raw  trajectory  (t\>X\) . (tn<xn)  to  a 

smoothed  trajectory  . . . ,  yn . 


Ej  w(*j'  0)^ 

ijwiU.tj) 


(i) 


where  is  a  weighting  function  that  monotonioalh  decreases  as  the  difference  between  /? 

and  tj  increases.  For  the  weighting  function,  we  use  the  following  definition  based  on  a  Gaussian 
kernel  with  standard  deviation  a. 


When  smoothing  horizontally,  we  use  o  —  5  s.  When  smoothing  vertically,  we  use  a  —  15  s.  A  larger 
a  is  required  for  vertical  smoothing  because  of  1 00 ft  Mode  C  quantization.  We  chose  these  values 
for  a  after  trying  different  standard  deviations  on  a  sampling  on  horizontal  and  vertical  profiles  in 
our  data  set;  the  chosen  values  preserve  the  underlying  tracks  while  removing  noise. 


3.3  INTERPOLATION 

The  time  interval  between  radar  scans  in  the  data  is  much  longer  than  the  1  s  time  step  of  the 
dynamic  Bayesian  network.  Terminal  (short  range)  radars  scan  aircraft  approximately  every  5 
seconds,  and  en  route  (long  range)  radars  scan  aircraft  every  10  to  12  seconds.  Additionally,  it  is 
common  for  sensors  to  skip  one  or  more  consecutive  scans  of  a  target  and  some  scans  produce  outliers 
that  we  remove  (Section  3.1).  Hence,  we  need  to  use  interpolation  to  estimate  the  parameters  in 
our  dynamic  Bayesian  network.  We  chose  a  piecewise-cubic  Hermite  interpolation  scheme  that 
preserves  monotonicity  and  shape  [15]. 

Figure  6  shows  the  result  of  outlier  detection,  smoothing,  and  interpolation  on  an  example 
track  from  the  radar  data  set.  Figure  7  shows  the  result  of  piecewise-cubic  Hermite  interpolation 
on  a  different  example  smoothed  track. 
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Figure  6.  Preprocessing.  Blue  lines  show  an  example  raw  track .  Red  lines  show  the  track  after  outlier 
removal  smoothing ,  and  interpolation. 


East  (NM) 


Figure  7.  Piecewise- cubic  Hermite  interpolation  on  an  example  smoothed  truck.  Red  crosses  indicate 
smoothed  data  points ,  and  the  blue  curve  shows  the  interpolation. 


3.4  FEATURE  EXTRACTION 

Feature  extraction  involves  converting  an  interpolated  track  into  sequences  of  quantized  bins  rep- 
resenting  airspace  classes,  altitude  layers,  airspeeds,  vertical  rates,  turn  rates,  and  accelerations. 


•  Airspace  class:  We  estimate  latitude  and  longitude  of  radar  returns  using  an  algorithm 
developed  at  Lincoln  [16].  From  altitude  estimates  and  latitude  and  longitude  estimates,  we 
determine  the  class  of  airspace  by  searching  through  the  National  Airspace  System  Resources 
(NASR)  database  provided  by  the  FAA.  Since  the  altitude  estimates  are  based  on  Mode 
C  reports  of  pressure  altitude,  uncorrected  for  barometric  variation,  it  is  possible  that  the 
airspace  of  some  tracks  are  identified  incorrectly.  We  expect  that  this  limited  inaccuracy  in 
airspace  class  identification  due  to  barometric  variation  has  a  negligible  impact  on  our  model. 
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•  Altitude  layer:  Altitude  above  ground  level  (AGL)  determines  the  altitude  layer  in  the 
model.  We  estimate  altitude  AGL  by  subtraeting  ail  estimate  of  ground  elevation  from 
pressure  altitude.  Our  estimates  of  ground  elevation  come  from  Digital  Terrain  Elevation  Data 
(DTED)  provided  by  the  National  Geospatial-Intelligence  Ageney  (NGA).  We  use  DTED 
Level  0,  which  has  post  spacing  of  30  arc-seconds  (approximately  900  meters). 

•  Airspeed:  The  true  airspeed  at  time  t  is  estimated  by 

r(t)  =  \J (,r(/  +  1)  -  7W2  +  (V(t  +  1)  -  vUW2  +  (Ht  +  1)  -  MO)2  • 

•  Vertical  rate:  The  vertical  rate  is  estimated  from  the  smoothed  and  interpolated  altitudes 
estimated  from  Mode*  C  reports.  The  vertical  rate  at  time  t  is  given  by  h(t)  =  h(t  +  1)  —  h(t)- 

•  Turn  rate:  We  first  compute  the  heading  along  tlu*  interpolated  track.  The  heading  at  time 

t  is  given  by  0(t)  and  corresponds  to  the  direction  from  to  (.;•(/  -1-  1  ).y(t  +  1)). 

To  compute  the  turn  rate  at  time  f.  we  find  the  acute  change  in  heading  between  v(0  and 
ij)(t  -|-  1).  Turns  to  the  right  have  positive  turn  rates,  and  turns  to  the  left  have  negative  turn 
rates. 

•  Acceleration:  To  find  the  acceleration  at  a  particular  point,  we  average  the  change  in 
airspeed  per  unit  time  looking  forward  one  time  step  and  looking  back  one  time  step. 

We  then  smooth  the  extracted  features  using  the  same  smoothing  scheme  we  used  for  tracks 
(Section  3.2).  For  turn  rate,  airspeed,  and  acceleration,  we  set  a  to  10 s.  20s,  and  20s  respectively. 
We  chose  these  numbers  large  enough  so  that  noise  is  removed  from  the  measurements  but  low 
enough  so  that  the  underlying  properties  of  the  maneuvers  are  not  lost.  We  do  not  smooth  vertical 
rates  in  this  step  because  the  altitudes  are  already  smoothed  (Section  3.2). 

In  order  to  be  modeled  by  a  discrete  Bayesian  network,  it  is  necessary  to  quantize  the  features. 

We  quantize  continuous  values  by  defining  a  sequence  of  eut  points  ci . cn.  Values  less  than  c\  are 

in  the  first  bin,  values  greater  than  cv  are  in  the  (n  +  l)th  bin,  and  values  in  the  half-open  interval 
[r7  pc*-)  are  in  the  ith  bin.  The  cut  points  we  used  for  quantization  are  listed  in  Table  2.  For 
example,  referring  to  Table  2.  all  airspeed  values  less  than  30  kt  are  placed  into  one  bin:  airspeeds 
between  30-60  kt  are  placed  in  a  different  bin,  etc.  The  cut  points  were  chosen  to  capture  the 
variation  of  the  features  as  shown  in  the  histograms  in  Figure  8. 

Figure  9  shows  the  result  of  feature  extraction  on  the  same  track  shown  in  Figure  6. 

3.5  STATISTICS  EXTRACTION 

With  structures  for  the  initial  and  transition  distributions  and  the  quantized  features  from  a  set 
of  tracks,  we  are  able  to  collect  the  sufficient  statistics  to  estimate  the  parameters  for  our  model. 
For  the  two  Bayesian  networks,  the  sufficient  statistics  are  simply  the  counts  Njjk  of  the  various 
features  (see  Appendix  l).2  Appendix  A  describes  the  sufficient  statistics  extracted  from  the  data. 

2  The  counts  are  called  sufficient  statistics  because  they  provide  a  summarization  of  the  data  that  is  sufficient  to 
compute  the  posterior  distribution  from  the  prior.  For  an  introduction  to  Bayesian  statistics,  see  [17]. 
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TABLE  2 


Cut  points  used  for  feature  quantization. 


Cut  Points 
A  B,  C,  D,  0 
L  1200,3000,5000 
v  30, 60, 90, 120. 140, 165, 250 
v  — 1,  —0,25, 0.25, 1 
h  -1250,  -750,  -250, 250, 750, 1250 
tP  —6,  —4.5,  —1.5, 1.5, 4.5, 6 


0  100  200  300  400 


-3-2-10  1  2  3 


Airspeed  (kt) 


Acceleration  (kt/s) 


-4000  -2000  0  2000 

Vertical  rate  (ft /min) 


4000  -10 


-5 


0 


10 


Fum  rate  (deg/s) 

Figure  8 .  Feature  histograms  of  recorded  radar  data  based  on  193  million  samples. 
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Vertical  rate  (ft/ min) 


Time  (s)  Time  (s) 


Figure  9.  A  plot  of  extracted  features  over  time.  Blue  lines  show  features  before  smoothing,  and.  red  lines 
show  features  after  smoothing.  The  green  blocks  show  the  bins  to  which  the  features  are  assigned. 
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The  counts  Nijk  are  then  compiled  into  a  separate  table  for  each  variable.  For  example,  the 
table  for  altitude  layer  L  (which  is  itself  dependent  on  airspace  class  A)  is  shown  in  Table  3.  Each 
cell  entry  in  the  table  represents  the  counts  for  that  combination  of  parent  bin  A  (B,  C,  D,  O) 
and  bin  of  L  ([500,1200),  [1200,3000),  [3000,5000),  [5000,  oo)).  An  electronic  file  available  from 
Lincoln  Laboratory  contains  all  of  the  data  required  to  construct  these  tables. 


TABLE  3 


N(L  |  A) 


A 

L 

(500.  1200) 

[1200.  3000) 

[3000,  5000) 

[5000,  oo) 

B 

265497 

315803 

197208 

64204 

O 

500465 

402396 

67303 

59 

D 

11209219 

4496306 

688 

0 

O 

41043613 

87491669 

31033237 

16457213 

To  ensure  a  sufficiently  large  data  set  was  used,  Figure  10  illustrates  the  convergence  of 
the  Bayesian  network  parameters  (probabilities)  as  additional  data  are  added.  The  horizontal  axis 
represents  the  number  of  samples  used  to  estimate  the  parameters  of  the  Bayesian  network,  and  the 
vertical  axis  represents  the  maximum  difference  between  any  elements  in  the  conditional  probability 
tables  with  the  addition  of  more  data.  As  the  figure  shows,  the  change  in  the  probabilities  in 
the  conditional  probabilities  converged  to  much  less  than  0.0001  given  the  amount  of  radar  data 
collected  over  the  span  of  two  weeks. 

To  summarize,  this  section  describes  the  process  used  to  construct  a  model  of  uncorrelated 
aircraft  trajectories  based  on  radar  data.  Raw  radar  data  were  processed  through  multiple  filtering 
and  tracking  stages  and  then  digested  into  a  series  of  probability  tables  organized  within  Bayesian 
networks.  Each  table  cell  represents  the  probability  of  a  particular  feature  taking  on  a  value  within 
a  certain  quantized  bin.  The  next  section  describes  how  to  use  these  tables  to  generate  random, 
but  statistically  representative,  trajectories  through  sampling.  An  example  of  producing  encounter 
trajectories  using  the  model  is  given  in  Appendix  B. 
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Figure  10.  Convergence  curve  for  Bayesian  network  conditional  probabilities. 
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4.  SAMPLING 


Once  the  data  have  been  processed  as  described  in  the  previous  section,  one  can  use  the 
model  structure  and  sufficient  statistics  to  produce  new  trajectories  that  are  representative  of  the 
ones  observed  by  radar.  The  first  step  involves  sampling  from  the  discrete  Bayesian  network  tables 
representing  the  initial  and  transition  distributions.  This  provides  a  series  of  bins  that  represent 
coarse  values  to  be  used  for  each  feature.  The  second  step  involves  converting  the  coarse,  discrete 
samples  into  fine,  continuous  samples  by  sampling  within  bins.  Figure  11  illustrates  this  process. 


Figure  11.  Sampling  process  flow. 


4.1  DISCRETE  SAMPLING 

We  begin  by  sampling  from  the  Bayesian  network  representing  the  initial  state  distribution.  Re¬ 
ferring  back  to  Figure  4a.  first  one  would  randomly  sample  from  the  table  describing  Ar(A)  to 
determine  the  bin  to  use  for  airspace  class  A.  Given  that  bin  for  airspace  class,  one  would  next 
sample  from  the  table  N(L  \  A)  to  determine  the  bin  for  altitude  layer  L  and  so  on.  See  Appendix  B 
for  a  more  detailed  description  of  this  process. 

Formally,  consider  sampling  from  a  table  describing  the  /th  variable  Xt  (e.g.,  where  X7  = 
altitude  layer).  As  was  shown  in  Table  3.  each  table  contains  a  series  of  bins  k  (e.g.,  [500, 1200), 
[1200,  3000).  [3000,5000).  [5000,  oc))  whose  likelihood  depends  on  each  parent  bin  j  (e.g.,  R.  C.  D, 

O). 


25 


I 


The  probability  assigned  to  bin  k  is  then  given  by 

aijk  +  Njjk  . 

Y2'=\(aijk'  +  Nijk') 

where 

•  j  is  the  instantiation  of  the  parents  of  Xj  in  the  Bayesian  network. 

•  Nijk  is  the  number  of  times  Xr  was  equal  to  k  when  its  parents  were  instantiated  to  j  in  the 

data, 

•  is  a  Dirichlet  prior  parameter,  and 

•  is  the  number  of  ways  to  instantiate  the  parents  of  X{. 

We  use  a{jk  =  1,  which  corresponds  to  the  prior  assumption  that  all  combinations  of  relative 
frequencies  for  k  are  equally  probable.  Sampling  from  the  posterior  distribution  with  a^k  =  1  is 
equivalent  to  adding  1  to  all  the  counts  in  the  tables  in  Appendix  A  and  sampling  according  to  the 
resulting  relative  frequencies.  This  ensures  that  there  are  no  transitions  with  zero  probability  in 
the  Markov  model. 

For  instance,  referring  to  Table  3,  if  airspace  class  A  had  previously  been  selected  to  be  D, 
then  the  probability  of  selecting  altitude  layer  L  =  [500, 1200)  would  be  P(L  =  [500, 1200))  = 
(1  +  1 1209219)/ (1  +  11209219  +  1  +  4496306  +  1  +  688  +  1  +  0)  =  0.7137. 

Once  we  have  the  initial  state  of  the  trajectory,  we  can  sample  from  the  dynamic  Bayesian 
network  representing  how  the  state  changes.  We  fix  h(t),  and  v(t ),  and  then  use  the  standard 

Bayesian  network  sampling  scheme  to  determine  h(t  +1),  +1),  and  v(t  +1).  The  process  may 

be  repeated  for  as  long  as  we  wish  to  run  the  trajectory. 

Our  sample  from  the  Bayesian  network  tells  us  into  which  bins  airspeed,  vertical  rate,  turn 
rate,  and  acceleration  fall.  We  then  sample  within  the  bins  as  discussed  in  Section  4.2. 

4.2  CONTINUOUS  SAMPLING 

To  produce  a  continuous  sample  given  a  coarse,  discrete  sample  from  the  initial  distribution,  we 
simply  sample  uniformly  within  the  bins.  For  example,  if  we  determine  that  the  initial  airspeed  is 
within  the  bin  [60.  90),  we  simply  sample  from  the  uniform  distribution  over  the  half-open  interval 
[60, 90).  Because  the  first  and  last  bins  associated  with  each  interval  are  unbounded,  it  is  necessary 
to  impose  some  bounds.  Table  4  shows  the  quantization  boundaries  based  on  the  limits  observed 
in  the  radar  data. 

With  regard  to  the  transition  network,  instead  of  sampling  at  every  time  step  within  bins 
for  turn  rate,  vertical  rate,  and  airspeed  acceleration,  we  only  produce  a  new  sample  with  a  fixed 
probability  per  time  step  that  has  been  estimated  from  the  radar  data.  This  prevents  numerous 
minor  changes  and  allows  relatively  steady-state  flight  conditions  to  occur  at  the  same  frequency 
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TABLE  4 


Sampling  boundaries. 


Boundaries 

7i  500. 1200,  300(h  5000. 12500 

v  0,  30, 60, 90.  120. 140.  165. 250. 300 
v  -2, -1.-0.25,0.25.1.2 
h  -2000.  -1250.  -750.  -250. 250.  750.  1250.  2000 
0  -8,  -6,  -4.5,  - 1 .5. 1.5,  4.5,  6.  8 


as  observed  in  the  radar  data.  The  within-bin  transition  rates  we  used  are  0.0200.  0.0395.  and 
0.0875  changes  per  second  for  acceleration,  vertical  rate,  and  turn  rate  respectively.  We  estimated 
these  rates  from  the  data  by  introducing  3  smaller  bins  within  each  bin  and  computing  the  relative 
frequency  that  tracks  stay  within  a  single  smaller  bin  (versus  moving  to  another  small  bin  within 
the  same  coarse  bin).  A  similar  strategy  was  used  by  Eurocontrol  for  their  cooperative  encounter 
model. 

When  producing  continuous  samples  from  bins  that  include  zero  in  their  range,  we  simply 
produce  zero  instead  of  sampling  uniformly  in  order  to  prevent  very  small  vertical  rates,  turn  rates, 
and  acceleration. 

Figure  12  plots  an  example  of  vertical  rates,  turn  rates,  accelerations,  and  airspeeds  generated 
by  sampling  from  the  Bayesian  networks.  First,  coarse  bins  are  selected  randomly  and  shown  as 
green  bars.  Next,  fine  values  within  each  bin  are  selected  and  shown  in  blue.  Note  that  vertical 
rate,  for  example,  is  held  to  precisely  zero  when  the  bill  spans  zero,  and  otherwise  the  vertical  rate 
may  be  reselected  several  times  within  11011-zero  bins  at  the  mean  rate  described  above.  Figure  13 
shows  the'  resulting  track  produced  by  the  sampled  features  shown  in  Figure  12.  Translation  of 
features  into  tracks  involves  running  a  discrete- time  simulation,  described  in  the  next  section. 
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Figure  12.  A  plot  of  sampled  features  over  time.  Green  blocks  show  the  coarse  discrete  bins  and  blue  lines 
show  the  resulting  fine  samples  within  those  bins. 


Figure  13.  A  track  generated  by  sampling  from  the  initial  and  transition  distributions.  Positions  and  altitudes 
are  shown  relative  to  the  initial  position  and  altitude. 
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5.  SIMULATION 


Earlier  sections  explained  how  to  build  a  dynamic  probabilistic  model  of  aircraft  and  how  to 
sample  from  the  model  to  produce  nominal  trajectories  that  are  representative  of  what  we  observed 
in  the  radar  data.  The  same  modeling  techniques  discussed  earlier  may  be  used  to  build  a  model 
of  the  manned  or  unmanned  aircraft  with  the  collision  avoidance  system  to  be  evaluated.  This 
section  explains  how  to  combine  a  model  of  the  aircraft  with  the  collision  avoidance  system  to  be 
evaluated,  which  we  call  AC1,  with  a  model  of  an  intruder,  which  we  call  AC2,  into  a  simulated 
encounter. 

The  trajectory  for  AC1  may  be  specified  by  the  analyst  (e.g.,  to  focus  on  a  particular  phase 
of  flight),  based  on  actual  flight  paths  from  mission  planning  or  radar  data,  or  randomly  generated 
using  a  statistical  model  representative  of  that  aircraft  s  typical  flight  profiles.  In  a  study  of 
conventional  VFR-on-VFR  encounters,  for  example,  trajectories  for  both  AC1  and  AC2  could  be 
generated  using  the  uncorrelated  encounter  model  described  here. 

We  avoid  simulating  encounters  that  are  extremely  unlikely  to  result  in  NMAOs  by  focusing 
computational  effort  on  encounters  that  occur  in  an  encounter  cylinder  centered  on  ACL  AC2  is 
initialized  at  a  random  location  on  the  surface  of  the  encounter  cylinder  and  the  dynamic  models 
are  used  to  update  the  states  of  AC1  and  AC2  over  time.  If  the  collision  avoidance  system  on  AC1 
commands  an  avoidance  maneuver,  it  overrides  the  nominal  rates  suggested  by  the  dynamic  model. 
If  AC2  enters  an  NMAC  cylinder  or  exits  the  encounter  cylinder,  the  encounter  run  is  terminated. 
The  NMAC  cylinder  has  radius  rnmar  and  height  2 hnma/c.  We  use  rnmac  =  500  ft  and  /imiiar  =  100  ft. 


5.1  ENCOUNTER  CYLINDER  DIMENSIONS 

The  encounter  cylinder  has  radius  renc  and  height  2//enc.  The  appropriate  dimensions  of  the  en¬ 
counter  cylinder  depend  on  the  airc  raft  dynamics  and  collision  avoidance  system.  If  the  encounter 
cylinder  is  too  small,  the  collision  avoidance  system  will  not  have  enough  opportunity  to  be  fully 
exercised  before'  a  collision.  If  the  encounter  cylinder  is  too  large,  then  computation  is  wasted. 

An  upper  bound  for  renc  is  the  amount  of  time  required  by  the  collision  avoidance  system 
to  detect,  track,  and  avoid  a  target  multiplied  by  the  sum  of  the  maximal  airspeeds  of  AC1  and 
AC2.  An  upper  bound  for  henc  is  the  amount  of  time  required  by  the  collision  avoidance  system  to 
detect,  track,  and  avoid  a  target  multiplied  by  the  sum  of  the  maximal  vertical  rate  magnitudes  of 
AC1  and  AC2. 


5.2  INITIAL  CONDITIONS 

We  use  rejection  sampling  to  generate  the  initial  conditions  of  an  encounter.  Rejection  sampling 
involves  proposing  a  series  of  candidate  samples  from  a  random  distribution  until  choosing  one 
that  meets  a  set  of  criteria  (summarized  in  Figure  14).  The  process  we  use  for  generating  initial 
conditions  for  encounters  is  as  follows: 


2fi 


R  ■  (v2  -  t?i)  <  0 


Figure  If •  Depending  on  where  on  the  encounter  cylinder  the  intruder  is  initialized ,  different  criteria  must 
be  met  in  order  to  accept  a  given  trajectory  for  further  simulation. 
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Figui'e  15.  Variables  relevant  to  rejection  sampling  in  the  horizontal  plane. 
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1.  Generate  airspeeds,  vertical  rates,  turn  rates,  and  accelerations  for  AC1  and  AC2  according  to 
their  models  such  that  they  belong  to  the  same  airspace  class  and  altitude  layer.  Forcing  this 
constraint  can  be  done  using  rejection  sampling.  Simply  generate  AC1  and  AC2  independently 
and  reject  both  if  they  have  a  different  airspace  class  or  altitude  layer. 

2.  Initialize  the  position  of  AC2  on  the  surface  of  the  encounter  cylinder  centered  on  ACl. 
AC2  may  be  initialized  on  the  top.  bottom,  or  side  surfaces  of  the  encounter  cylinder.  The 
probability  of  being  located  on  the  top.  bottom,  or  side  is  proportional  to  the  surface  area 
of  each.  Once  top.  bottom,  or  side  has  been  selected.  AC2  is  randomly  positioned  using  a 
two-dimensional  uniform  distribution  across  that  surface.  The  bearing  of  AC2  relative  to 
ACl  is  denoted  \. 

3.  The  heading  of  ACl  is  set  to  0.  The  heading  of  AC2,  denoted  0,  is  randomly  selected  from 
a  uniform  distribution  over  [0, 27r). 

4.  If  AC2  was  initialized  on  the  top  of  the  encounter  cylinder,  accept  the  sample  if  the  vertical 
rate  of  AC 2  relative  to  ACl.  denoted  hreK  is  negative.  This  ensures  that  AC2  is  penetrating 
the  encounter  cylinder  for  the  first  time. 

5.  If  AC2  was  initialized  on  the  bottom  of  the  encounter  cylinder,  accept  the  sample  if  the 
vertical  rate  of  AC2  relative  to  ACl.  denoted  /?rel,  is  positive.  This  ensures  that  AC2  is 
penetrating  the  encounter  cylinder  for  the  first  time. 

6.  If  AC2  was  initialized  on  the  side  of  the  encounter  cylinder,  accept  the  sample  if  R-  (i>2  —  v\) 
is  negative,  where  R  =  (sin  \.cos\)  is  the  unit  vector  from  ACl  to  AC2.  The  vectors  v\  and 
V2  are  the  ground  velocities  of  ACl  and  AC2  respectively.  When  R  •  (v2  ~  tp)  is  negative, 
the  relative  velocity  of  AC2  is  into  the  encounter  cylinder,  and  therefore  should  be  accepted. 
See  Figure  15. 

The  process  is  repeated  until  a  candidate  initialization  is  accepted. 

A  byproduct  of  rejection  sampling  is  that  the  intruder  bearing  distribution  is  nonuniform. 
As  one  would  expect,  more  encounters  occur  head  on  than  from  the  side  or  rear.  Figure  16  shows 
bearing  distributions  for  two  conventional  VFR  aircraft. 

5.3  TRAJECTORY  CONSTRUCTION 

Once  the  initial  conditions  are  selected,  the  dynamic  models  of  ACl  and  AC2  are  used  to  update 
their  trajectories  over  time. 

Given  the  initial  state  of  an  aircraft  (including  at  a  minimum:  position,  altitude,  heading, 
climb  rate,  turn  rate,  and  velocity)  and  its  control  variables  (/?,  ij\  and  v)  the  aircraft  state  is 
updated  in  <  1  s  time  steps  using  a  dynamic  model.  Due  to  the  wide  variety  of  possible  dynamic 
model  implementations  [18],  details  for  computing  this  state  update  are  not  provided  here.  A  basic 
approach  would  be  to  apply  simple  point-mass  kinematics  to  update  the  aircraft  state  without 
considering  what  the  aircraft  may  actually  be  doing  in  terms  of  bank  angle,  pitch  rate,  etc.  More 
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Figure  16.  Initial  bearing  distributions  produced  by  rejection  sampling  for  two  randomly- generated  VFR 
aircraft. 

complex  implementations  could  include  6  degree-of-freedom  dynamic  models  in  which  the  control 
variables  are  treated  as  target  states  provided  to  an  autoflight  control  system  which  then  applies 
the  necessary  control  deflections  to  attain  those  targets.  Lincoln  Laboratory’s  Collision  Avoidance 
System  Safety  Assessment  Tool  (CASSATT)  typically  uses  a  4  degree-of-freedom  model  to  update 
aircraft  state  by  applying  the  necessary  airspeed  acceleration,  roll  rate,  and  pitch  rate  to  achieve 
the  target  values  for  /i,  and  v  assuming  curvilinear  motion  with  a  zero-sideslip  constraint.  The 
zero-sideslip  constraint  can  be  relaxed,  for  instance,  if  it  is  necessary  to  model  transient  dynamics 
with  higher  fidelity.  Appendix  E  explains  how  to  validate  trajectories  produced  by  other  simulation 
software  against  trajectories  produced  by  CASSATT. 

A  simulation  run  terminates  when  the  intruder  either  exits  the  encounter  cylinder  or  pene¬ 
trates  the  NMAC  cylinder.  After  running  many  simulations,  P(nmac  |  enc)  is  estimated  by  dividing 
the  number  of  runs  that  resulted  in  an  NMAC  by  the  total  number  of  runs. 


5.4  MULTIPLE  ENCOUNTERS 

The  simulation  currently  only  handles  pairwise  encounters.  The  probability  of  an  intruder  pene¬ 
trating  the  encounter  cylinder  while  another  intruder  is  within  the  encounter  cylinder  is  likely  to 
be  very  small.  One  may  compute  this  probability  using 

POO  POO 

/  pM[1  —  eAenc*]d t  =  1  —  /  p(t)ex*nctdt ,  (4) 

Jo  Jo 

where  p(t)  is  the  distribution  over  the  amount  of  time  intruders  spend  in  the  encounter  cylinder  and 
Aenc  is  the  rate  at  which  new  intruders  penetrate  the  encounter  cylinder.  In  the  above  equation, 
1— eAenct  comes  from  the  cumulative  density  function  for  an  exponential  distribution.  The  possibility 
of  simultaneous  multiple  intruders  needs  to  be  examined  and  will  be  an  area  of  future  work. 
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6.  SAFETY  EVALUATION 


Tins  section  explains  how  to  estimate  the  NMAC  rate,  denoted  Anmac,  based  on  a  large  number 
of  simulations.  We  begin  by  observing  that 

Anmac  =  C(imiar  |  enc)Aeuc  . 

where  P(nrnac  |  enc)  is  the  probability  that  an  aircraft  that  enters  the  encounter  cylinder  penetrates 
the  NMAC  cylinder  before  exiting  the  encounter  cylinder  and  Aenc  is  the  rate  at  which  aircraft 
penetrate  the  encounter  cylinder.  The  mean  time  between  NMACs  is  simply  A“nliac. 

This  section  makes  the  following  assumptions: 

1.  the  density  of  air  traffic  outside  the  encounter  cylinder  is  uniform  in  the  local  region  being 
studied,  and 

2.  the  trajectories  of  aircraft  outside  of  the  encounter  cylinder  are  independent  of  tlu'  trajectories 
of  aircraft  within  the  encounter  cylinder, 

From  these  two  assumptions,  we  explain  how  to  compute  P(nmac.  |  enc)  and  Aenr.  Note  that 
estimating  traffic*  density  requires  a  more  focused  assessment  of  a  particular  region  and  time  period, 
beyond  the  scope  of  this  report.  However,  sufficient  radar  data  have  been  archived  at  Lincoln 
Laboratory  for  the  U.S.  to  allow  such  an  analysis  at  most  regions  (with  the  exception  of  potential 
terrain  masking  of  radar  reports  in  certain  areas). 

G.l  ESTIMATING  NMAC  PROBABILITY 

Section  5.2  explained  how  to  construct  an  encounter  from  two  independent  trajectories  sampled 
from  the  distribution  represented  by  Bayesian  networks.  By  generating  a  large  collection  of  en¬ 
counters  and  determining  which  encounters  lead  to  NMACs.  we  can  estimate  P(nmae  |  enc). 
Unfortunately,  we  cannot  simply  divide  the  number  of  sampled  encounters  that  lead  to  NMACs  by 
the  total  number  of  sampled  encounters  to  estimate  P(nmae  |  enc)  due  to  the  fact  that  our  sampling 
scheme  does  not  produce  encounters  from  the  same  distribution  that  would  occur  in  the  airspace. 
In  particular,  we  generate  encounters  with  aircraft  velocities  distributed  identically  to  the  aircraft 
population  at  large,  despite  the  fact  that  in  reality  the  distribution  of  aircraft  velocities  given  that 
an  encounter  is  occurring  favors  high-speed  aircraft  .  Although  we  sample  from  a  distribution  that 
is  different  from  the  true  distribution  when  constructing  encounters,  we  can  still  use  the  samples 
to  estimate  P(nmac  |  enc)  so  long  as  we  weight  their  results  properly  using  an  approach  known  as 
importance  sampling  [19].  We  will  begin  by  stating  the  weighting  scheme  and  then  prove  that  it  is 
correct. 

Section  4  explained  how  to  generate  the  trajectories  for  AC1  and  AC2,  which  we  call  Z\  and 
Z2 ,  by  sampling  from  the  Bayesian  networks  with  the  requirement  that  both  aircraft  come  from  the 
same  airspace  class  and  altitude  layer.  Section  5.2  explained  how  to  randomly  select  the  position 
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of  AC2  relative  to  AC1.  which  we  call  xr ,  and  the  heading  of  AC2  relative  to  AC1.  which  we  call 
'll).  Importance  sampling  allows  us  to  make  the  following  approximation  based  on  N  samples 


P(nmac  |  enc)  »  —  P( 

i 

,(i)  Ji) 


lunar  z 


j!\ \  x{) \enc) 


The  weight  V(z\  ,  z\  )/V  corrects  for  the  fact  that  our  sampling  distribution  does  not  match 
the  true  distribution  of  encounter  situations.  The  function  V(z[  .  z^)  is  the  average  volume 
the  encounter  cylinder  sweeps  out  per  unit  time  when  AC1  follows  z ^  and  the  airspace  consists 
exclusively  of  aircraft  following  z^\  In  particular, 
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where  Jp  is  the  relative  heading  of  AC2,  v\  and  v\  are  the  initial  ground  speeds,  and  h\  and  h 2 
are  the  initial  vertical  rates  of  z^  and  z^j\  The  constant  V  is  the  average  volume  the  encounter 
cylinder  sweeps  out  per  unit  time 


V 


=  JJ  p(zi)p{z2 


z\)V(z\.  z2)  dz](lz2  - 


Note  that  the  distribution  over  z2  is  conditional  on  zi  due  to  the  constraint  that  AC1  and  AC2 
must  belong  to  the  same  airspace  class  and  altitude  layer.  The  constant  V  can  be  estimated  using 
N  samples: 


V 


1 

N 


(5) 


Now  that  we  have  defined  our  weighting  scheme,  we  will  now  prove  that  it  is  correct.  From 
the  laws  of  probability, 


P(nmac  |  enc)  —  JJ  P(nmac  |  zi,Z2,enc)p(zi,Z2  |  enc)dzidz2 

=  JJJJ  ^(lllllac  I  zi*  z2,  ^  enc)p(/0.  xr  |  zi,  Z2,enc)p(zit  Z2  |  enc)  dz \dz2d'ipdxr  . 


We  may  approximate  P(nmac  |  enc)  using  Monte  Carlo  sampling.  Since  it  is  difficult  to 
sample  from  p(z\.Z2  |  enc)  directly,  we  sample  z\  and  Z2  from  the  distribution  represented  by  our 
Bayesian  network  subject  to  the  constraint  that  both  aircraft  come  from  the  same  airspace  class 
and  altitude  layer,  and  weight  the  samples  appropriately: 


P(nmac  |  enc) 
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Wo  know  that 


We  may  normalize  to  obtain 


We  may  substitute  and  simplify  to  obtain 
P(mnac  |  enc)  » 


which  corresponds  to  t lie  weighting  scheme  we  defined. 

6.2  ESTIMATING  ENCOUNTER  RATE 

Estimating  the  encounter  rate  requires  knowing  the  density  of  traffic  outside  the  encounter  cylinder. 
This  density  p.  can  be  expressed  in  aircraft  per  NM3.  The  rate  at  which  aircraft  enter  the  encounter 
cylinder  is  the  product  of  p  and  the  average  volume  of  new  airspace  the  encounter  cylinder  sweeps 
through  per  unit  time,  V .  which  was  discussed  in  Section  6.1: 


^enc  —  pV  • 


(6) 


Appendix  H  plots  values  for  p  for  VFR  (1200-code)  traffic  over  the  continental  United  States.  The 
highest  density  area  covers  Miami-Dade  County  and  has  a  density  of  approximately  0.003  aircraft 
per  NM3  over  all  altitude  layers  between  500  ft  AGL  and  FL180.  This  density  is  an  average  over  the 
data  collected  from  1-7  December  and  1  7  June.  It  is  important  to  note  that  the  density  during 
the  day  is  significantly  higher  than  at  night.  Hence,  if  one  is  interested  in  estimating  collision  risk 
due  to  VFR  aircraft  when  flying  a  particular  unmanned  aircraft  during  the  day,  for  example,  then 
one  must  use  a  value  for  p  that  is  specific  to  the  expected  hours  of  operation  so  that  collision  risk 
is  not  underestimated. 

The  V  in  Equation  6  depends  upon  the  size  of  the  encounter  cylinder  and  on  the  average 
velocities  of  aircraft  involved  in  encounters.  As  one  example,  an  estimate  of  V  was  obtained  em¬ 
pirically  by  generating  1  million  encounters  from  the  uncorrelated  encounter  model  and  applying 
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the  approximation  in  Equation  5.  If  the  encounter  cylinder  has  radius  5NM  and  height  3000ft, 
then  V  was  found  to  be  approximately  937NM3/hr  for  trajectories  generated  from  the  uncorrelated 
encounter  model.  The  average  VFR  traffic  encounter  rate  for  Miami-Dade  County,  for  example, 
is  then  0.003  x  937  =  2.8  encounters  per  flight  hour.  Note  that  this  does  not  imply  that  collision 
avoidance  maneuvering  would  necessarily  occur  2.8  times  per  hour  because  not  all  encounters  re¬ 
quire  maneuvering.  Most  of  these  encounters  dissipate  benignly,  but  it  is  important  to  simulate 
them  to  ensure  the  collision  avoidance  system  does  not  induce  problems.  Further,  because  this 
example  density  estimate  was  based  on  an  average  over  two  weeks,  night  and  day,  at  all  altitudes 
between  500  to  18,000  ft,  it  is  important  to  reiterate  the  need  to  obtain  density  values  for  specific 
operating  altitudes,  airspaces,  and  times  to  get  a  more  accurate  estimate  of  the  encounter  rate 
for  a  particular  operational  concept.  Filially,  encounter  rates  for  cooperative  traffic  and  for  non- 
cooperative  unconventional  aircraft  also  need  to  be  considered  to  obtain  an  overall  collision  risk 
estimate. 


We  may  compute  AIimac  as  follows: 
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7.  SUMMARY 


This  report  presents  a  new  approach  to  encounter  modeling  that  allows  for  the  generation  of 
more  realist  ic  encounters  than  previous  models.  The  approach  involves  modeling  the  dynamics  of 
aircraft  state  based  on  a  Markov  model  where  the  probability  of  the  next  state  depends  only  upon 
the  current  state.  One  way  to  represent  a  Markov  model  is  with  an  exhaustive  state-transition 
matrix  that  specifies  the  probability  of  transitioning  between  all  pairs  of  states.  Unfortunately,  the 
number  of  independent  parameters  required  to  define  the  matrix  grows  super-exponentially  with 
the  number  of  variables  defining  the  model.  The  more  independent  parameters  there  are  in  the 
model,  the  more  data  one  needs  to  properly  estimate  their  values.  However,  by  using  dynamic 
Bayesian  networks,  we  leverage  conditional  independence  between  some  variables  to  greatly  reduce 
the  number  of  parameters.  We  learn  the  structure  of  the  dynamic  Bayesian  network  by  maximizing 
the  posterior  probability  of  the  network  structure  given  the  data. 

The  work  presented  in  this  report  has  focused  on  a  model  where  the  trajectories  of  the  aircraft 
involved  in  the  encounter  are  independent  of  each  other  prior  to  intervent  ion  by  a  collision  avoidance 
system,  human  or  automated.  This  model  assumes  that  aircraft  blunder  into  close  proximity 
without  prior  intervention.  A  separate,  correlated  encounter  model  has  also  been  developed  for 
aircraft  under  air  traffic  control,  where  intervention  may  impact  the  way  in  which  aircraft  encounter 
each  other.  Such  a  correlated  encounter  model  is  similar  to  the  uncorrelated  encounter  model 
described  here  except  that  the  variables  defining  the  state  of  the  aircraft  in  the  encounter  are  tied 
to  each  other  in  the  dynamic  Bayesian  network.  We  are  also  developing  a  non  cooperative  model  of 
unconventional  aircraft  such  as  balloons,  ultralights,  and  gliders. 
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APPENDIX  A 
MODEL  PARAMETERS 


This  appendix  describes  the  sufficient  statistics.  Njjk  (see  Appendix  1).  used  to  estimate'  the 
conditional  probabilities  associated  with  the  initial  and  transition  distributions.  These  sufficient 
statistics  are  based  on  beacon  reports  associated  with  aircraft  squawking  VFR  (Mode  A  code  1200) 
from  1  7  December  2007  and  1  7  June  2008.  This  amounts  to  74.000  Right  hours  from  across  the 
United  States  after  track  fusion.  Other  parameters  that  are  relevant  to  generating  new  encounters 
from  the  model  are  also  described  in  this  section. 

A  text  file,  available  electronically  from  Lincoln  Laboratory,  describes  the  following  model 
parameters: 

•  Variable  labels:  A  quoted,  comma-delimited  list  specific's  the  variable  labels,  e.g..  \dot  \psi. 
as  would  be  used  by  PT^X.  There  are  different  variable  labels  for  the  initial  network  and  the 
transition  network.  The  ordering  of  the  variables  in  this  list  determines  the  ordering  of  the 
variables  in  the  other  tables.  Note  that  the  ordering  of  the  variable  labels  does  not  necessarily 
correspond  to  the  order  in  which  they  are  sampled;  a  topological  sort  may  be  necessary  before 
sampling. 

•  Graphical  structure:  A  binary  matrix  is  used  to  represent  graphical  structure.  A  1  in  the 
ith  row  and  jth  column  means  that  there  is  a  directed  edge  from  the  ith  variable  to  the  j tli 
variable'  in  the  Bayesian  network,  the  ordering  of  the  variables  are  as  defined  in  the  variable 
labels  section  of  the  file.  The  text  file  specifies  two  graphical  structures;  one  for  the  initial 
network  and  the  other  for  the  transition  network.  The  element  in  the  / tli  row  and  jth  column 
is  represented  by  G(i.j). 

•  Variable  instantiations:  For  each  network,  a  list  of  integers  specifies  the  number  of  instan¬ 
tiations  that  exist  for  each  variable. 

•  Sufficient  statistics:  For  each  network,  a  list  of  integers  specifies  the  sufficient  statistics. 
We  explain  how  to  interpret  the  array  of  integers  below. 

•  Boundaries:  The  boundaries  of  the  variable  bins  are  specified  by  a  row  of  numbers.  The 
variables  A  and  L  are  not  quantized  because  they  are  already  discrete,  and  so  boundaries  do 
not  exist.  A  *  is  used  for  these  variables. 

•  Resampling  rates:  A  list  of  numbers  specify  the  resampling  rates  (Section  4.2). 

The  list  of  numbers  describing  the  sufficient  statistics.  N}j^ ,  requires  explanation.  The  array  is 
ordered  first  by  increasing  k\  then  increasing  j ,  and  then  increasing  i.  Again,  the  variable  ordering 
is  as  defined  in  the  variable  labels  section  of  the  file.  One  way  to  load  the  sufficient  statistics  into 
memory  is  to  allocate  an  array  of  pointers  to  2-dimensional  matrices.  There  would  be  (>  matrices 
for  the  initial  network  and  9  matrices  for  the  transition  network.  The  dimensions  of  each  matrix 
is  rt  x  or  the  number  of  instantiations  of  the  variable  by  the  number  of  instantiations  of  the 


parents  of  the  variable  (see  Appendix  I).3  The  counts  may  be  read  directly  into  the  matrices  from 
the  file,  starting  with  the  first  column  of  the  first  variable  to  the  last  column  of  the  last  variable. 

Instead  of  reading  the  sufficient  statistics  into  an  array  of  matrices  stored  in  memory,  one 
('an  reference  the  elements  in  the  parameters  file  directly.  For  some  specified  variable  i,  parental 
instantiation  j,  and  variable  instantiation  k ,  the  value  Ny*  is  given  by  the  following  element  in  the 
list 

i— l 

k  +  ri(j-l)  +  ^2qiin>,  (A-l) 

i'= 1 

where  q  and  r  are  as  specified  in  Appendix  I.4 * 

It  is  important  to  clarify  the  ordering  of  the  parental  instantiations.  If  the  variables  X \ , . . . ,  Xn 
are  instantiated  to  bins  &i, . . . ,  bn,  the  parental  instantiation  of  variable  Xj  is  given  by 

3 = i + e  G^'  *■)&'  -  u  n  •  (A_2) 

i'= 1  i"= 1 

For  example,  suppose  that  a  variable  has  three  parents.  The  first  parental  instantiation  will  assign 
all  parents  to  their  first  bin.  The  second  parental  instantiation  will  assign  the  first  parent  (as 
defined  by  the  ordering  in  the  variable  labels  portion  of  the  file)  to  its  second  bin  and  the  other 
two  parents  to  their  first  bin.  The  sequence  continues  until  all  of  the  parents  are  instantiated  to 
their  last  bins. 

The  following  is  a  fragment  of  the  parameter  file.  The  lines  that  describe  the  sufficient 
statistics  are  truncated  due  to  length. 

#  labels_initial 

"A",  "L"  ,  V,  "\dot  vn,  M\dot  h" ,  M\dot  \psiM 

#  G_initial 
0  11111 
0  0  1111 
0  0  0  1  1  1 
0  0  0  0  1  1 
0  0  0  0  0  1 
0  0  0  0  0  0 

#  r.initial 
4  4  8  5  7  7 

#  N_initial 

842712  970223  15706213  176025732  265497  315803  197208  64204  . . . 

#  labels.transition 

"A",  "L",  "v",  M\dot  v(t)M ,  "\dot  h(t) 11 ,  M\dot  \psi(t)M,  "\dot  v(t+l)n , 

3 For  the  transition  network,  note  that  the  matrices  for  the  variables  that  are  not  associated  with  time  t  +  1  are 
empty. 

4In  the  transition  network,  q\  is  as  defined  in  Appendix  I  for  the  nodes  representing  variables  at  time  t  +  1.  For 

the  other  nodes,  i.e.,  the  static  variables  and  the  variables  at  time  £,  qi  is  set  to  zero. 


40 


"\dot 

h(t+l) 

m 

1 

'\dot  \psi(t+l)" 

# 

G_ 

.transition 

0 

0 

0 

0 

0 

0 

0 

1 

1 

0 

0 

0 

0 

0 

0 

0 

1 

1 

0 

0 

0 

0 

0 

0 

1 

1 

1 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

# 

r_ 

.transition 

4 

4 

8 

5 

7 

7 

5 

7 

7 

# 

N_ 

.transition 

5 

0 

0 

0 

0 

8 

0 

0 

0  0  30  5  0  0  0  35 

# 

boundaries 
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* 

0  30  60  90  120  140  165  250  300 
-2  -1  -0.25  0.25  1  2 

-2000  -1250  -750  -250  250  750  1250  2000 
-8  -6  -4.5  -1.5  1.5  4.5  6  8 
#  resample_rates 

000  0.019969  0.0394758  0.08752 
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APPENDIX  B 

TRAJECTORY  GENERATION 


This  appendix  explains  how  to  generate  an  encounter  from  the  model.  Software  for  parsing 
the  parameters  hie  (Appendix  A)  and  generating  trajectories  is  available  from  the  authors. 


B.l  INITIAL  NETWORK  SAMPLING 

The  first  step  in  generating  a  random  trajectory  using  the  model  is  to  sample  from  the  Bayesian 
network  representing  the  initial  state  distribution.  To  sample  from  a  Bayesian  network,  as  explained 
in  Appendix  I,  one  must  first  produce  a  topological  sort  of  the  nodes  in  the  network.  A  topological 
sort  orders  the  nodes  of  the  network  so  that  parents  come  before  their  descendants.  The  following 
is  the  graphical  structure  of  the  initial  network  as  specified  in  the  parameters  file  (sex'  Appendix  A) 
and  shown  in  Figure  B-l: 

0  11111 
0  0  1111 
0  0  0  1  1  1 
0  0  0  0  1  1 
0  0  0  0  0  1 
0  0  0  0  0  0 


As  can  be  seen,  this  ordering  of  the  nodes  is  already  topologically  sorted:  the  first  node 
(airspace  class)  is  connected  to  all  other  nodes.  The  second  node  (altitude  layer)  is  connected  to 
all  following  nodes,  arid  so  on.  The  final  node  (</’)  is  not  connected  to  any  other  nodes.  We  will  see 
later  that  the  transition  network,  in  contrast,  requires  a  topological  sort. 

With  the  nodes  topologically  sorted,  we  begin  by  sampling  the  first  variable.  As  specified  in 
the  parameters  file,  the  first  variable  is  A,  airspace  class.  Equation  1-5,  reproduced  below,  shows 
how  to  produce  a  random  sample: 


P(Xi  =  k  |  Kij.  D.  G) 


<> ijk  +  Ay* 

Ylkf=\{aijkf  +  Afy*' ) 


Iii  other  words,  the  probability  of  selecting  a  particular  airspace  class  is  proportional  to  the  prior 
(ay*)  plus  the  frequency  that  airspace  class  appeared  in  the  data  (Ary*).  Following  Cooper  and 
Herskovits  [20],  we  use  an  objective  prior  and  set  ay*  to  1.  To  determine  the  values  for  Ary*.  we 
need  to  look  at  the  sufficient  statistics  recorded  in  the  parameters  file.  The  sufficient  statistics 
for  the  initial  network  is  recorded  as  a  long  series  of  numbers.  We  use  the  scheme  described  in 
Appendix  A  to  determine  the  actual  values.  These  values  turn  out  to  be  the  first  four  numbers  in 
the  sufficient  statistics  sequence  and  are  shown  in  the  following  table. 


Figure  D-l.  A  graphical  representation  of  the  initial  network. 
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TABLE  B-l 


N(A) 


/I 

F3 

C 

I) 

() 

812712 

970223 

15706213 

1 76025732 

Now.  wo  may  c  ompute  the  probability  of  selecting  the  different  values  of  A  using  Equation  1-5: 

•  P(A  -  B)  =  (842712  +  1)/(842712  +  1  +970223  +  1  +  15706215  +  1  +  176025732+1)  =  0.0044 

•  P(A  =  C)  =  (970223  +  1)/(842712+  1  +970223+1  +  15706213+1  +  176025732+1)  -  0.0050 

•  P(A  =  I))  =  (15706213+ 1)/(842712  + 1+970223+ 1  +  15706213+ 1  +  176025732+1)  =  0.0812 

•  P(A  =  0)  =  (176025732+ 1)/(84271 2+ 1+970223+ 1  +  15706213+1  +  176025732+1)  =  0.9095 

We  use  a  random  number  generator  to  choose  an  airspace.  For  this  example,  let  us  choose  ().  the* 
fourth  instantiation  of  A. 

The  next  step  is  to  instantiate  the  second  variable,  L.  which  is  altitude  layer.  Choosing  a 
random  instantiation  for  L  is  not  quite  as  easy  as  for  A  because  L  depends  upon  other  variable's, 
namely  A.  We  need  to  compute  the  conditional  probability  distribution  P(L  |  A  =  ()).  i.e.,  the 
distribution  over  the  values  of  L  given  that  A  is  ().  We  consult  the  sufficient  statistics  table  N(L  \  A) 
which  is  extracted  from  the  parameters  file'  (using  the'  process  explained  in  Appendix  A)  and  is 
displayed  in  the  table1  below. 


TABLE  B-2 


N(L  |  A) 


A 

/, 

[500,  1200) 

(1200.  3000) 

(3000.  5000) 

[5000,  oo) 

B 

265497 

315803 

197208 

64201 

C 

500465 

402396 

67303 

59 

I) 

1 1209219 

1496306 

688 

0 

() 

41043613 

87491669 

31033237 

16457213 

45 


Since  we  are  only  interested  in  the  counts  associated  with  A  =  0,  we  consider  only  the  last 
row  in  the  table.  To  translate  the  counts  into  probabilities,  we  add  1  to  each  element  in  the  last  row 
and  then  divide  by  the  total  resulting  sum  of  that  row.  Again,  we  use  a  random  number  generator 
to  select  an  airspace  layer  according  to  these  probabilities.  Suppose  we  choose  layer  [500, 1200). 

The  next  variable  to  be  assigned  is  v.  We  consult  the  table  for  v,  focusing  on  the  row 
(highlighted  below)  that  is  consistent  with  the  variable  assignments  so  far,  i.e..  A  =  0  and  L  = 
[500, 1200).  We  choose  a  random  airspeed  based  on  the  counts  as  with  the  previous  variables. 


TABLE  B-3 

N{v  |  A,L) 


A 

L 

if 

[0,  30) 

[30,  60) 

[60.  90) 

[90.  120) 

(120,  140) 

[140,  165) 

[165,  250) 

[250,  300) 

B 

[500.  1200) 

13503 

31576 

81977 

95741 

26229 

10404 

3795 

2272 

C 

[500,  1200) 

16282 

32805 

183761 

197837 

41048 

18341 

8069 

2322 

D 

[500,  1200) 

128191 

583471 

4645064 

4180547 

860838 

447421 

283816 

79871 

O 

[500,  1200) 

320032 

1929263 

13222694 

17080198 

5241637 

2218202 

890255 

141332 

B 

[1200,  3000) 

8673 

12141 

53304 

123132 

59683 

39014 

17051 

2805 

C 

[1200.  3000) 

8259 

16204 

97168 

151460 

56595 

43606 

26120 

2984 

D 

[1200,  3000) 

45303 

110802 

1193578 

1951024 

585532 

333172 

247795 

29100 

O 

[1200,  3000) 

218586 

2516479 

20054858 

38148145 

14552307 

8113608 

3645015 

242671 

B 

[3000,  5000) 

1494 

9164 

47800 

77434 

30430 

17990 

11636 

1260 

C 

[3000,  5000) 

1571 

2705 

24363 

15223 

9684 

8597 

4787 

373 

D 

[3000,  5000) 

0 

38 

159 

278 

137 

22 

26 

28 

O 

[3000,  5000) 

59902 

822607 

5610286 

11325094 

5793651 

4435405 

2843841 

142451 

B 

[5000,  18000] 

4094 

4984 

12720 

19086 

10292 

6720 

5553 

755 

C 

[5000,  18000) 

0 

0 

0 

11 

48 

0 

0 

0 

D 

[5000,  18000] 

0 

0 

0 

0 

0 

0 

0 

0 

O 

(5000,  18000] 

64081 

185566 

1421432 

3514892 

2983582 

3611450 

4160916 

515294 

This  process  of  randomly  assigning  values  to  each  of  the  variables  conditional  on  the  values 
of  their  parents  continues  until  all  of  the  variables  have  been  assigned.  For  this  example,  assume 
that  the  following  assignments  have  been  made: 


•  A  =  O 

•  L  =  [500. 1200) 

•  v=  [120,140) 

•  v  =  [-0.25,0.25) 

•  h  =  [—2000,  —1250) 

•  0  =  [-1.5, 1.5) 

Specific  values  within  each  bin  are  then  determined  based  on  a  different  uniform  random 
number  for  each.  In  the  case  where  a  bin  spans  0  (as  it  does  in  this  example  for  i ).  and  0),  the 
value  0  itself  is  assigned. 
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B.2  TRANSITION  NETWORK  SAMPLING 


We  have  described  how  to  sample  from  the  initial  Bayesian  network  to  generate  a  random  initial 
state.  The  next  step  is  to  use  the  dynamic  Bayesian  network  representing  the  transition  distribution 
to  generate  the  state  at  the  next  time  step  based  on  the  initial  state.  The  parameters  file  defines 
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The  graphical  representation  of  this  matrix  is  shown  in  Figure  B-‘2.  As  can  be  seem  the 
ordering  in  the  parameters  file  is  not  topologically  sorted  because  there  are  arrows  from  h(t  +  1) 
and  xp(t  +  1)  to  v(t  +  1)  but  v(t+  1)  comes  before  both  h(t  +  1)  and  0(£  4- 1)  in  the  list  of  variables  in 
the  parameters  file.  We  need  to  topologically  sort  the  list  of  variables  to  make  sure  that  we  assign 
values  to  variables  in  the  right  order.  One  possible  topological  sort  involves  simply  moving  i)(t  T  1) 
to  the  end  of  the  list.  This  sorting  only  affects  the  order  in  which  the  variables  are  sampled,  not 
the  semantics  of  the  ordering  in  the  parameters  file. 
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Figuir  B-2.  A  gmphical  representation  of  the  transition  network. 


We  are  only  interested  in  assigning  new  values  to  the  dynamic  variables,  namely  ii ( t  -f  1), 
</’(/  +  1)  and  v(t.  +  1).  The  process  is  similar  to  the  process  list'd  to  sample  from  the  initial  network. 
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First,  we  consult  the  table  of  sufficient  statistics  for  h(t  +  1).  This  table  of  conditional  counts 
is  shown  below.  For  sampling  purposes,  we  are  only  interested  in  the  row  that  represents  the 
current  variable  assignment,  i.e.,  A  =  O,  L  =  [500, 1200),  v  =  [120, 140).  h  =  [—2000,  —1250),  and 
ip  =  [—1.5, 1.5).  The  relevant  row  is  highlighted  in  Table  B-4. 
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Continued  on  next  page. 


TABLE  B-4.  Continued 
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We  look  at  the  row  of  interest  and  randomly  assign  a  value  to  h(t  +  1)  with  probability 
proportional  to  the  corresponding  elements  plus  1.  We  then  assign  ip(t  +  1)  to  a  random  value 
conditional  on  A ,  L,  t ).  and  v.  Finally,  we  assign  v{t  -1-  1)  based  on  the  assignments  of  v ,  v(t ), 
h(t  +  1),  and  ip(t  +  1).  The  process  is  repeated  for  each  time  step.  The  length  of  the  trajectory 
may  be  made  as  long  as  desired. 

Sampling  from  the  Bayesian  networks  produces  a  sequence  of  assignments  of  variables  to 
bins,  e.g.,  [250,750).  As  Section  4  describes,  the  next  step  is  to  sample  uniformly  within  the  bins 
to  produce  real  values  (writh  the  exception  of  a  bin  spanning  0  in  which  case  0  itself  is  selected). 
If  we  simply  sample  within  each  bin  at  each  time  step,  there  would  be  excessive  variability  in  the 
vertical  rates,  turn  rates,  and  acceleration.  We  therefore  only  resample  within  bins  at  mean  rates 
specified  in  the  parameters  file: 

000  0.019969  0.0394758  0.08752 

The  first  three  variables  have  zero  as  their  rates  because  they  do  not  change  during  the  course  of 
the  trajectory.  The  last  three  elements  specify  the  rates  with  which  v,  h,  and  ip  change  within  bins. 
As  the  trajectory  evolves,  whenever  a  variable  switches  to  a  new  bin  we  must  sample  within  the 
bin.  Typically,  the  values  of  variables  continue  in  the  same  bin  at  the  next  time  step.  If  the  bin 
remains  the  same,  we  either  continue  with  the  same  value  or  resample  within  the  bin  according  to 
the  specified  rates. 

Once  the  initial  conditions  and  a  series  of  control  variables  (i\  /i,  and  ip)  have  been  selected, 
the  aircraft  trajectory  is  constructed  or  simulated  using  an  appropriate  dynamic  model.  The  control 
variables  are  assumed  to  be  held  constant  across  each  1  s  time  step. 

B.3  VALIDATION 

One  way  to  validate  an  implementation  of  the  sampling  scheme  is  to  generate  a  large  collection 
of  samples  and  compare  the  generated  feature  distributions  against  the  distributions  shown  in 
Appendix  C  for  the  full  dataset  used  to  construct  the  model  (shown  in  black  in  Figures  C-l  and  C- 
2).  Since  the  histograms  only  show  the  marginal  distributions  for  the  variables  and  does  not  reveal 
the  important  correlation  between  variables,  an  approach  similar  to  that  discussed  in  Appendix  C 
is  appropriate.  So  long  as  the  ratio  in  Equation  C-l  is  large,  then  one  may  be  confident  that  the 
sampling  scheme  was  implemented  correctly.  When  we  generated  1,000  samples,  we  found  that  the 
natural  logarithm  of  the  ratio  was  1.4  x  103  for  the  initial  network  and  1.5  x  103  for  the  transition 
network,  indicating  with  extremely  high  confidence  that  our  1,000  samples  came  from  the  same 
distribution  as  the  model. 
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APPENDIX  C 

SEASONAL  AND  GEOGRAPHIC  COMPARISON 


The  model  described  in  this  report  represents  the  average  behavior  of  1200-squawking  aircraft 
in  the  continental  United  States.  If  this  model  is  to  be  used  for  more  focused  safety  studies,  for 
example  studies  concerned  with  only  a  particular  region,  it  is  important  to  understand  how  the 
behavior  of  a  smaller  population  compares  to  the  national  average. 

We  found  almost  no  seasonal  variation  of  trajectory  characteristics  in  the  radar  data.  Fig¬ 
ure  C-l  shows  the  distributions  over  the  features  that  define  the  model  for  the  full  dataset  and 
compares  them  with  data  from  December  and  June.  Although  the  behavior  of  the  aircraft  across 
seasons  are  similar,  the  variation  of  traffic  densities  do  vary  significantly.  Because  VFR  traffic  are 
restricted  to  visual  meteorological  conditions,  the  density  of  VFR  traffic  is  strongly  influenced  by 
weather.  In  areas  with  fairly  consistent  visual  conditions,  such  as  California,  the  traffic  density  is 
fairly  constant,  but  in  areas  with  varying  weather  condition,  such  the  Northeast,  the  traffic  density 
fluctuates  as  one  would  expect. 

To  determine  how  traffic  behavior  changes  with  location,  we  compared  the  feature  distribu¬ 
tions  for  traffic  captured  by  a  Los  Angeles,  California  sensor  and  a  Manchester,  New  Hampshire 
sensor.  As  Figure  C-2  shows,  the  distributions  over  airspace  class  and  altitude  layer  are  signifi¬ 
cantly  different  from  the  national  average.  This  difference  is  to  be  expected  because  both  the  Los 
Angeles  and  Manchester  sensors  are  located  in  terminal  areas.  There  are  also  significant  differences 
in  airspeed.  The  acceleration,  vertical  rate,  and  turn  rati1  distributions  are  more  comparable. 

We  hypothesized  that  the  geographic  variation  in  airspeed,  acceleration,  vertical  rate,  and 
turn  rate  was  due  to  differences  in  the  distribution  over  airspace  class  and  altitude  layer.  To 
test  this  hypothesis,  we  generated  new  distributions  from  the  national  data,  but  controlling  for 
airspace  class  and  altitude  layer.  We  used  the  Bayesian  network  parameters  estimated  from  the 
national  data  and  substituted  the  airspace  class  and  altitude  layer  parameters  estimated  from 
the  Los  Angeles  data.  We  sampled  from  the  resulting  joint  distribution  to  estimate  the  marginal 
distributions  for  the  airspeed,  acceleration,  vertical  rate,  and  turn  rate  variables.  If  the  differences 
in  the  feature  distributions  were  due  only  to  the  differences  in  airspace  class  and  altitude  layer, 
then  the  sampled  marginal  distributions  would  resemble  the  distributions  from  the  Los  Angeles 
data.  As  Figure  C-3  shows,  the  acceleration,  vertical  rate,  and  turn  rate  distributions  match,  but 
the  airspeed  distribution  remains  somewhat  different.  The  airspeed  near  terminal  areas,  such  as 
Los  Angeles  and  Manchester,  seem  to  be  somewhat  slower  than  the  national  average  would  predict, 
even  when  compensated  for  airspace  class  and  altitude.  If  a  safety  study  is  to  focus  specifically 
on  a  terminal  area,  it  may  be  worth  examining  how  the  distribution  in  airspeeds  differ  from  the 
national  average  and  adjust  the  model  accordingly. 

The  plots  of  the  marginal  distributions  of  the  various  features  provide  a  visual  way  of  compar¬ 
ing  the  airspace  properties  of  different  seasons  and  geographic  regions.  The  marginal  distribution 
plots  do  not  capture,  however,  the  relationships  (correlations)  between  variables,  which  is  especially 
important  when  comparing  the  dynamic  behavior  of  aircraft.  What  we  wish  to  determine  is  the 
probability  that  two  different  populations,  e.g.,  Los  Angeles  and  the  national  average,  have  the 


same  Bayesian  network  parameters  given  the  recorded  data  (assuming  that  the  graphical  structure 
of  both  networks  are  identical).  Let  us  denote  the  true  parameters  of  the  two  populations  as  6\ 
and  02  and  the  data  associated  with  the  two  populations  as  and  D2 .  We  wish  to  compute  the 
posterior  P{0\  =  62  |  -Di,  D2),  which  involves  multiplying  the  prior  distribution  of  the  hypothesis 
that  P{6\  =  62)  by  the  ratio 

PjDu  £2  I  01  =  02)  _  J'  PjDwDi  I  0)m  do  c 

P(Dl,D2)  [ j  P(Dl  |  0)p(0)d6]  [JP(D2  I  e)p{0)d0]  ' 

If  we  define  f(D)  =  f  P(D\6)p(6)  dO.  the  ratio  becomes 

/ (D\  U  Z)2) 
f(Di)f(D2)  ' 

As  shown  by  Cooper  and  Herskovits  [20]  and  using  the  notation  used  in  Appendix  I, 

f(D)  =  TT  TT  1  (f*bo)  TT 

f=,MrKo  +  ^)/fJj  FK,)  • 

We  computed  the  natural  logarithm  of  the  ratio  in  Equation  C-l  to  compare  the  feature 
distributions  from  different  data  sets  (see  Table  C-l).  The  more  positive  the  log-ratio  is,  the 
more  likely  the  distributions  are  the  same.  Negative  log-ratios  indicate  that  the  distributions  are 
different.  The  transition  distributions  for  different  seasons  and  different  geographic  locations  were 
found  to  be  statistically  similar.  We  conclude  that  although  there  is  spatial  and  temporal  variation 
in  traffic  density  and  spatial  variation  in  airspace  class,  altitude  layer,  and  airspeed,  the  evolution 
of  trajectories  follow  a  very  similar  stochastic  process  conditioned  on  airspace  class,  altitude  layer, 
and  airspeed. 


TABLE  C-l 

Log  ratio  comparing  the  transition  distribution  of  different  populations. 


Population  1 

Population  2 

Log  Ratio 

1-7  December  2008 

1  7  June  2008 

6.3  x  104 

Manchester,  Nil 

Los  Angeles,  CA 

2.3  x  104 

Manchester,  NH 

National  Average 

2.6  x  104 

Los  Angeles,  CA 

National  Average 

3.8  x  104 
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Vertical  rate  (ft/niin)  Turn  rate  (deg/s) 

Figure  C-L  Seasonal  feature  distribution  comparison .  Shown  in  black  are  the  distribution  of  features  from 
the  dataset  used  to  create  the  model .  which  came  from  recorded  radar  feeds  from  1  7  December  2007  and  1  7 
June  2008.  Shown  in  green  are  the  feature  distributions  from  only  1  7  December  2007.  Shown  in  red  are  the 
feature  distributions  from  only  1  7  June  2008.  As  the  plots  reveal ,  the  distributions  for  December  and  June 
are  nearly  identical. 


Airspace  Class 


1  0 

Acceleration  (kt/s) 


-2000  -1000  0  1000  2000  -10  -5  0  5  10 


Vertical  rate  (ft/min)  Turn  rate  (deg/s) 

Figure  C-2.  Geographical  feature  distribution  comparison.  Shown  in  black  are  the  distribution  of  features 
from  the  dataset  used  to  create  the  model  Shown  in  red  are  the  feature  distributions  from  the  LAXB  sensor 
located  in  in  Los  Angeles ,  California.  Shown  in  green  are  the  feature  distributions  from  A1HT  located  in 
Manchester ,  New  Hampshire. 
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Vertical  rate  (ft/ min)  Turn  rate  (deg/s) 

Figure  C-3.  Geographical  feature  distribution  comparison  corrected  for  airspace  class  and  altitude  layer. 
Shown  in  red  are  the  feature  distributions  from  the  LAXB  sensor  located  in  in  Los  Angeles,  California. 
Shown  in  black  are  the  feature  distributions  from  the  national  model  but  adjusted  to  inflect  the  airspace  class 
and  altitude  layer  distributions  in  Los  Angeles. 
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APPENDIX  D 

NONCOOPERATIVE  TRAFFIC  COMPARISON 


Modeling  noncooperative  traffic  is  especially  important  when  evaluating  collision  avoidance 
systems  for  unmanned  aircraft.  Unmanned  aircraft  typically  rely  upon  electro-optical  or  primary 
radar  sensors  to  detect  aircraft  without  transponders.  This  report  described  a  general  framework 
that  can  be  used  for  constructing  an  uncorrelated  encounter  model  from  any  data  source,  but  the 
model  was  based  on  cooperative  1200-code  tracks  captured  by  secondary  surveillance  radar.  A 
key  question  is  whether  such  a  model,  based  on  cooperative  aircraft,  can  be  used  as  a  surrogate 
for  noncooperative  aircraft.  Many  noncooperative  aircraft  are  single-engine  fixed-wing  airplanes 
similar  to  those  that  fly  cooperatively  using  1200-codes.  Thus,  modeling  1200-code  characteristics 
will  include  these  conventional  types  of  noncooperative  aircraft.  On  the  other  hand,  there  are  also 
noncooperative  aircraft  that  typically  do  not  use  transponders  (e.g..  balloons)  and  so  would  not  be 
represented  in  the  model  described  in  this  report. 

Constructing  a  model  based  on  primary-only  radar  returns  from  aircraft  without  transponders 
is  significantly  more  c  hallenging,  as  discussed  in  the  introduction  of  this  report.  Of  the  generally- 
available  surveillance  radars  that  have  height-finding  capabilities,  height  estimates  are  typically 
quite  poor,  making  it  difficult  to  infer  altitude  and  vertical  rates.  Another  problem  with  primary- 
only  data  is  that  significant  clutter  can  disrupt  the  development  of  clean  aircraft  tracks  needed  to 
extract  features  such  as  turn  rate. 

To  assess  the  degree  to  which  the  model  reported  here  can  be  used  as  a  surrogate  for  some 
noncooperative  aircraft,  we  built  specialized  primary-only  target  classifiers.  These  classifiers  take 
as  input  radar  track  coordinates  and  output  whether  it  is  likely  that  the  track  represents  an  aircraft 
(as  opposed  to  a  Hock  of  birds  or  ground  clutter,  for  example).  Based  on  a  collection  of  recorded 
radar  data  from  8  terminal  radar  sensors,  we  developed  Gaussian  mixture  models  representing  the 
('lass-conditional  distributions.  The  details  of  this  analysis  arc  discussed  elsewhere  [21]. 

Two  classifiers  were  implemented.  The  first  classifier  was  constructed  using  supervised  learn¬ 
ing  techniques  and  based  oil  two  data  sets:  (1)  a  set  of  known  non-aircraft  primary-only  tracks: 
and  (2)  a  set  of  1200-code  cooperative  aircraft  tracks.  The  11011-aircraft  tracks  consisted  of  primary- 
only  tracks  within  class  B  airspace,  where  a  transponder  is  almost  always  required.  Although  it  is 
possible  that  some  aircraft  without  transponders  were  operating  in  class  B  airspace  (e.g..  during 
formation  flight),  we  are  confident  that  the  vast  majority  of  the  tracks  observed  within  (‘lass  B 
airspace  are  due  to  birds  and  other  11011-aircraft  phenomena.  The  second  data  set  based  oil  1200- 
code  tracks  were  obtained  by  filtering  for  the  appropriate  Mode  A  code  from  the  sensors  used  in 
the  analysis.  The  two  data  sets.  then,  provided  mutually-exclusive  examples  of  eases  known  with 
confidence  to  he  either  (1)  non-aircraft  tracks  or  (2)  aircraft  tracks. 

The  second  classifier  was  constructed  using  semi-supervised  learning  techniques.  Training 
sets  in  this  case  included:  (1)  a  set  of  known  non-aircraft  primary-only  tracks  similar  to  the  prior 
case:  and  (2)  a  large  collection  of  unlabeled  (unknown)  primary-only  tracks. 

Sets  of  primary-onl>  tracks  were  then  passed  into  these  classifiers,  which  in  turn  output  a 
three-way  decision:  the  track  was  either  (1)  “conventional” :  an  aircraft  similar  to  a  1200-code 
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aircraft;  (2)  “unconventional”:  an  aircraft  not  similar  to  a  1200-code  aircraft;  or  (3)  not  an  air¬ 
craft.  By  appropriately  tuning  the  classifiers  using  test  tracks,  it  was  possible  to  obtain  correct 
classifications  at  a  rate  above  90%. 

Figure  D-l  compares  feature  histograms  from  1200-code  beacon  tracks  against  the  noncoop¬ 
erative  aircraft  tracks  identified  by  the  two  classifiers.  Cooperative  aircraft  features  are  shown  in 
black,  representing  typical  1200-code  conventional  aircraft  behavior.  Primary-only  tracks  classified 
as  “conventional”  are  shown  in  red;  primary-only  tracks  classified  as  “unconventional”  are  shown  in 
green.  Not  shown  are  the  characteristics  of  tracks  identified  as  “not  aircraft”.  For  reference,  a  dis¬ 
tribution  of  glider  airspeeds  is  also  provided  in  cyan,  extracted  from  publicly-available  repositories 
of  recorded  GPS  tracks.5 

As  shown,  the  airspeeds,  accelerations,  and  turn  rates  of  “conventional”  primary-only  tracks 
are  comparable  to  cooperative  1200-code  aircraft.  Thus,  there  is  a  population  of  noncooperative 
aircraft  that  are  considered  to  be  adequately  captured  by  the  uncorrelated  model  described  in 
this  document.  The  airspeeds  of  “unconventional”  primary-only  tracks  are  significantly  lower  than 
1200-code  aircraft,  and  in  addition  have  somewhat  lower  airspeed  acceleration  and  more  frequent 
non- zero  turn  rates;  the  model  presented  here  is  therefore  not  considered  to  provide  an  adequate 
representation  of  characteristics  for  some  classes  of  aircraft. 

Gliders  are  likely  to  represent  a  signification  fraction  of  the  “unconventional”  noncooperative 
aircraft,  and  there  are  other  categories  of  unconventional  aircraft  such  as  balloons  (both  manned 
and  unmanned)  and  powered  ultralights.  Separate  encounter  models  are  currently  being  developed 
specifically  for  these  unconventional  aircraft. 

To  summarize,  features  from  a  certain  class  of  noncooperative  aircraft  are  adequately  captured 
by  the  uncorrelated  model  presented  in  this  document.  The  advantage  of  using  this  encounter 
model,  derived  from  cooperative  1200-code  aircraft,  is  that  the  aircraft  tracks  are  reliable  and 
contain  altitude  information.  The  result  is  greater  confidence  that  appropriate  aircraft  dynamics  are 
modeled,  and  it  is  therefore  advantageous  to  use  this  model  as  a  surrogate  for  many  noncooperative 
aircraft.  For  those  aircraft  that  behave  in  a  more  unconventional  manner,  such  as  gliders  and 
balloons,  a  separate  model  is  required  and  being  developed. 

A  more  focused  analysis  of  a  particular  operation  region  would  require  first  determining 
the  relative  populations  of  “conventional”  and  “unconventional”  noncooperative  aircraft  using  the 
classifiers  developed  here  operating  on  primary-only  radar  data.  Then,  collision  risk  estimates  from 
each  of  the  appropriate  models  would  be  combined  in  a  weighted  maimer  to  provide  an  overall, 
adjusted  risk  level  based  on  that  traffic  mix. 


5See  http: //soaringweb . org  and  http: //ww . paxaglidingf orum . com/modules .php?name=leonardo&op=list_ 
flights. 
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Acceleration  (kt/s) 


Turn  rate  (deg/s) 

Figure  D-l.  Feature  histograms  for  cooperative  1200-code  aircraft  (black),  conventional  noncooperative  air¬ 
craft  (red),  unconventional  non  cooperative  aircraft  (green),  and  known  glider  aircraft  (cyan,  airspeed  only). 
Vertical  rates  cannot  be  inferred  from  primar'y  only  tracks. 
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APPENDIX  E 

DYNAMIC  SIMULATION  VALIDATION 

This  appendix  describes  a  set  of  sample  trajectories  that  may  be  used  as  validation  for  those 
wishing  to  implement  their  own  dynamic  simulation  models.  A  text  file,  named  uncor_tracks.txt 
and  available  from  Lincoln  Laboratory,  contains  50  synthetic  tracks.  Each  track  is  50  seconds  in 
length  and  was  generated  by  sampling  model  parameters  from  the  dynamic  Bayesian  network  and 
then  simulating  the  aircraft  trajectory  in  Lincoln  Laboratory’s  dynamic  simulation  model. 

The  text  file  is  a  space-delimited  list  with  the  following  ten  columns: 

•  ID:  Each  row  begins  with  an  id  number.  All  rows  with  the  same  id  number  correspond  to 
a  single  track. 

•  Time  t:  Time  values  are  reported  once  per  second  in  the  text  file. 

•  Vertical  rate  h:  The  vertical  rate  is  reported  in  ft /min.  This  value  is  sampled  from  the 
dynamic  Bayesian  network. 

•  Airspeed  acceleration  v :  The  airspeed  acceleration  is  reported  in  kt/s  and  is  also  sampled 
from  the  dynamic  Bayesian  network. 

•  Turn  rate  t/d  Turn  rate  is  reported  in  deg/s  arid  is  also  a  variable  from  the  dynamic'  Bayesian 
network. 

•  Airspeed  v:  The  initial  airspeed  (kt)  is  defined  by  sampling  from  the  initial  Bayesian  net¬ 
work.  The  remaining  values  for  t  >  0  are  outputs  from  Lincoln  Laboratory’s  dynamic  simu¬ 
lation. 

•  North  position  x:  Each  aircraft  is  initialized  at  x  =  0  ft. 

•  East  position  y:  Each  aircraft  is  initialized  at  y  —  0  ft. 

•  Altitude  h:  All  sample  aircraft  trajectories  are  initialized  at  an  altitude  of  10.000 ft. 

•  Heading  t/n  All  aircraft  are  initialized  heading  due  north  (ip  =  0  deg). 

The  dynamic  variables  (/?,  t\  and  ip)  are  inputs  to  an  aircraft  dynamic  model  that  describe 
how  the  aircraft  maneuvers  throughout  the  simulation.  The  variables  .r,  y,  and  h  can  be  used 
to  compare  outputs  of  a  simulation.  If  an  aircraft  is  initialized  at  the  origin  with  zero  heading  and 
an  altitude  of  10,000  ft,  then  the  simulated  track  should  approximately  match  the  values  provided 
in  the  text  file. 

Exactly  matching  simulation  outputs  with  values  in  the  text  file  is  extremely  unlikely  due  to 
the  variety  of  acceptable  aircraft  models  in  simulation  (e.g.,  degrees-of- freedom,  transient  dynamics, 
simulation  step  size.  etc  ).  The  simulated  and  provided  tracks  only  need  to  approximate  each  other 
in  both  the  vertical  and  horizontal  plane.  Note  that  small  differences  in  heading,  resulting  from  a 


turn  early  in  the  simulation,  can  result  in  large  positional  deviations  further  into  the  simulation 
horizontal  errors  resulting  from  small  differences  in  aircraft  headings  are  acceptable  for  validating 
an  aircraft  dynamic  simulation  model  and  implementation  of  the  uncorrelated  encounter  model. 

Since  there  are  numerous  aircraft  models  that  may  be  correctly  implemented  in  simulation, 
this  appendix  does  not  define  thresholds  or  minimum  errors  for  validating  a  simulation.  Instead, 
we  only  require  that  simulated  aircraft  must  track  steady-state  inputs.  For  example,  if  sampling 
from  the  dynamic  Bayesian  network  results  in  an  aircraft  flying  straight  (i.e.,  0  =  0  deg/s)  that 
later  transitions  to  ip  =  2  deg/s,  then  the  aircraft  must  eventually  turn  clockwise  at  2  deg/s  the 
transient  between  ip  =  0  and  ip  —  2  can  differ  from  the  sample  trajectories. 
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APPENDIX  F 
TRACKING  AND  FUSION 


Converting  raw  radar  reports  into  tracks  that  are  usable  for  our  model  development  is  a 
two-stage  process.  The  first  stage  involves  forming  local  tracks  from  the  reports  associated  with 
each  sensor.  The  second  stage  involves  fusing  the  local  tracks  from  multiple  sensors  to  form  global 
tracks.  This  appendix  provides  a  brief  overview  of  tracking  and  fusion. 

We  use  the  tracking  algorithms  from  the  Mode  S  and  ASR-9  systems  [13],  the  two  most 
modern  sensors  in  the  Air  Traffic  Control  System.  The  beacon  correlation  algorithms  come  from 
Mode  S  and  the  primary  radar  correlation  algorithms  come  from  ASR-9  Processor  Augmentation 
Card  (9-PAC).  Both  systems  are  integrated  to  provide  a  consistent  track,  although  for  the  purpose 
of  this  report  we  ignore  primary  only  reports. 

After  the  reports  of  each  sensor  have  been  correlated  into  local  tracks,  we  can  fuse  the  local 
tracks  to  provide  a  global  picture  of  the  airspace.  Fusion  performs  the  following  functions: 

1.  Merge  tracks  from  multiple  sensors  that  correspond  to  the  same  aircraft  into  a  single  global 
track. 

2.  Compute'  the  speed  and  heading  of  each  global  track  to  permit  trajectory  predictions. 

3.  Correct  sensor  tracking  errors  that  would  have  led  to  split  global  tracks  and  thus  false  en¬ 
counters. 

We  use  a  track- to -track  fusion  method,  meaning  that  we  track  each  sensor's  individual  reports 
and  then  we  merge  all  of  the  tracks  [22].  The  main  advantages  of  track-to-track  fusion  over  report- 
to-track  fusion,  in  which  all  reports  are  correlated  directly  to  global  tracks,  are: 

1.  Bias  independence  for  velocity  determination  and  maneuver  detection. 

2.  Removal  of  short  update  interval  velocity  anomalies. 

3.  Reduced  likelihood  of  forming  clutter  tracks. 

4.  Reduced  likelihood  of  introducing  incorrect  data  points  into  the  track  due  to  correlation 
errors. 

Track  merging  employs  position,  velocity,  Mode1  A  code,  and  altitude  as  matching  attributes  over 
the  entire  track.  Track  merging  for  tracks  with  discrete  codes,  which  are  unique  within  an  area, 
employs  large  correlation  boxes  for  each  of  the  matching  attributes.  Other  tracks  (1200  code 
or  radar-only)  must  pass  more  stringent  position  tests  and  velocity  tests  in  order  to  be  merged 
together. 

Our  fusion  method  works  forward  in  time.  Thus,  there  is  often  doubt  in  whether  or  not 
tracks  from  multiple  radars  are  index'd  the  same  track  with  just  a  few  data  points.  In  cases  of 


doubt,  tentative  matches  are  remembered  that  can  be  upgraded  to  a  merge  after  more  scans  of 
information  are  obtained.  Merges  are  checked  each  scan  and  can  be  undone  if  later  found  to  be 
unsatisfactory.  Aircraft  code  changes  are  also  accommodated,  although  they  must  be  verified  by 
other  sensors  in  the  merge  set  of  the  global  track  before  being  accepted.  If  only  one  sensor  reports 
a  code  change,  it  is  assumed  that  the  sensor  had  a  track  swap,  and  the  merge  situation  is  altered 
accordingly. 

The  remainder  of  this  section  explains  how  we  add  local  sensor  tracks  to  global  tracks,  how 
to  estimate  track  velocity  as  part  of  the  fusion  process,  and  how  to  filter  for  encounters. 

F.l  ADDING  LOCAL  SENSOR  TRACKS  TO  GLOBAL  TRACKS 

In  order  to  facilitate  fusing  of  tracks  from  multiple  sensors  into  global  tracks,  we  break  the  con¬ 
tinental  United  States  into  20  NM  by  20 NM  bins.  Every  track  is  associated  to  a  geographic  bin. 
Whenever  the  fusion  process  receives  a  new  local  sensor  track,  or  a  later  report  for  an  as  yet  un- 
fused  local  track,  an  attempt  is  made  to  fuse  this  track  to  an  existing  global  track.  This  process 
is  performed  by  comparing  the  new  track  to  all  neighboring  global  tracks.  The  neighboring  global 
tracks  for  a  local  track  include  all  global  tracks  in  the  same  geographical  bin  as  the  local  track 
plus  global  tracks  in  the  surrounding  bins  (totalling  9  bins).  Several  tests  must  be  passed  for  the 
successful  fusion  of  the  single  track  to  a  global  track: 

1.  The  global  track  must  not  already  be  fused  to  another  local  track  from  the  new  track’s  sensor. 

2.  The  tracks  must  agree  on  Mode  A  code  (primary-only  tracks  automatically  pass). 

3.  If  the  code  agreement  was  on  a  discrete  code,  a  very  coarse  horizontal  positional  test  must 
be  passed. 

4.  If  the  code  agreement  was  on  1200  code,  or  no  codes,  a  tighter  positional  test  must  be  passed, 
as  well  as  altitude  and  velocity  tests.  However,  if  only  the  velocity  test  fails,  a  potential  fusion 
is  declared;  3  successive  potential  fusions  with  the  same  global  track  results  in  a  successful 
fusion. 

If  more  than  one  possible  global  track  satisfies  the  fusion  tests,  the  one  with  the  highest  matching 
score  is  chosen.  Existing  fusion  matches  are  checked  each  time  a  new  report  is  received.  If  the 
tests  fail  for  three  scans  in  a  row,  the  fusion  is  ended,  and  a  new  global  track  is  sought  for  the  local 
sensor  track. 

F.1.1  Code  Matching  Test 

Normally,  the  code  of  the  new  track  is  the  same  as  the  code  of  the  global  track.  However,  code 
mismatching  can  complicate  the  fusion  process.  Code  declaration  errors  due  to  data  corruption, 
missing  codes,  and  code  changes  due  to  controller  action  are  all  common.  For  this  reason,  associated 
with  each  global  track  is  an  established  code  and  an  alternate  code,  which  is  the  code  of  the  most 
recent  report.  Usually  these  two  codes  are  the  same.  When  an  alternate  code  is  different  from 
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the  established  code  for  three  successive  reports,  however,  the  established  code  is  updated  to  the 
alternative  code.  Reports  with  no  beacon  code  are  ignored  in  this  process.  However,  if  a  track  has 
never  had  been  associated  with  a  beacon  code  report,  we  consider  this  track  to  be  a  primary-only 
and  give  the  track  an  established  code  of  0. 

If  both  the  local  track  the  global  track  have  a  beacon  code,  then  a  successful  code  match  is 
declared  if  any  of  the  following  statements  are  true: 

1.  The  established  codes  match. 

2.  The  alternate  codes  match. 

d.  One  of  the  established  codes  and  the  other  alternate  code  match. 

However,  failure  is  declared  if  the  match  is  on  code  0.  and  both  tracks  have  a  beacon  code  in  the 
other  code  slot  that  do  not  match.  Lastly,  if  one  track  is  radar  only,  and  the  other  track  has  a 
beacon  code,  failure  is  declared. 

To  handle  local  track  code  changes  due  to  a  track  swap  in  the  single  sensor  tracker,  confir¬ 
mation  of  the  code  change  by  the  global  track  is  required.  If  at  the  time  of  the  local  track  code 
change,  the  global  track  has  had  an  update  by  a  different  sensor's  track  with  the  old  beacon  code, 
a  track  swap  is  declared,  and  the  local  track  is  removed  from  fusion  with  the  global  track.  The 
local  track  then  undergoes  a  new  fusion  process. 

F.1.2  Horizontal  Position  Matching  Test 

Horizontal  positional  matching  requires  agreement  between  the  global  track's  most  recent 
horizontal  position  xg,i/g  and  the  new  local  track’s  horizontal  position  x\.  y\  projected  back  to  the 
time  of  the  global  track.  This  test  is  simple  if  both  track's  reports  contained  altitude.  If  a  radar 
report  contains  altitude  h,  range  p,  azimuth  9.  then  the  track's  horizontal  position  x.y  can  be 
empirically  determined. 

If,  however,  the  altitude  of  a  track  is  unknown,  then  the  tracks*  altitude  has  to  be  assumed 
(or  guessed)  in  order  to  derive  the  track’s  horizontal  position.  Simply  guessing  an  altitude  may 
produce  a  erroneous  x, y  position.  In  order  to  use  a  reasonable  altitude  value,  we  employ  the 
following  algorithm: 

1.  If  only  one  track  has  known  altitude,  then  we  convert  the  other  track’s  stored  p.  0  position 
to  ,i\  y  using  the  first  track’s  altitude. 

2.  If  neither  track  has  known  altitude  (which  is  always  true  for  a  primary-only  match),  we 
consider  all  altitudes  from  ONM  to  7NM  at  1  NM  steps.  We  then  use  the  altitude  that 
produces  the  closest  positional  match  between  the  two  tracks.  While  a  smaller  step  size  may 
produce  more  accurate  estimates,  we  found  that  1  NM  is  sufficient  for  fusing  two  tracks. 


Horizontal  positional  agreement  is  declared  if  the  horizontal  distance  between  the  two  tracks 
is  less  than  an  acceptable  value: 

\] (xg  -  *i)2  +  (yg  -  yi)2  <  Ar  max  “1“  3(7 azPg  “I”  3(7az  Pi,  (F-l) 

where  we  use  Armax  =  20  NM  for  a  discrete  code  match  and  Armax  =  1NM  for  a  1200  code  or 
radar  match.  The  standard  deviation  for  horizontal  position  error  terms  craz  account  for  positional 
errors  due  to  azimuth  noise  in  radar  measurements  (which  is  the  dominant  source  of  horizontal 
position  error).  We  model  the  standard  deviation  of  the  azimuth  noise  as  3  milliradians  for  the 
data  format  of  our  radar  feed. 

F.1.3  Altitude  Matching  Test 

Altitude  matching  requires  agreement  between  the  local  and  global  track  altitudes  when 
both  are  known.  Two  comparisons  are  tested;  the  success  of  either  test  results  in  a  match.  The 
comparison  test  is: 

At, 

A  hi 
A  h\ 

A  hy 

Ah 

where  we  use  A/?max  —  600  ft  and  A frmax  —  100  ft/s.  Since  the  altitude  of  a  track  can  significantly 
change  between  sequential  reports,  we  test  both  the  most  recent  and  previous  local  track  altitudes 
with  the  most  recent  global  track  altitude  update.  Only  one  of  the  local  track  altitudes  is  required 
to  pass  the  test. 

F.1.4  Velocity  Matching  Test 

Velocity  matching  requires  agreement  between  the  two  track  headings  *0  and  speeds  s  accord¬ 
ing  to  the  following  tests: 
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where  we  use  A^max  —  45°  and  Asmax  =  100  kt.  The  last  test  is  needed  for  slow  aircraft  and 
clutter  tracks,  to  prevent,  for  example,  speeds  of  20  and  llOkt  from  agreeing. 

F.2  DETERMINING  TRACK  AIRSPEED  AND  HEADING 

Determining  a  global  track’s  airspeed  and  heading  is  a  two  step  process.  First,  the  individual 
sensor  tracks  are  smoothed.  Second,  the  individual  track  are  averaged  using  relative  weights  that 
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account  for  sensor  update  times  and  the  quality  of  eaeli  sensor's  measurement.  We  apply  both 
alpha  smoothing  and  curve  fitting  to  determine  airspeed  and  heading,  depending  upon  the  track 
situation.  We  have  implemented  various  maneuver  detection  algorithms,  and  tracking  is  dependent 
upon  the  current  turn  rate  arid  acceleration  states  of  the  track. 

F.2.1  Local  Track  Smoothing 

First,  we  require  that  the  track  has  moved  a  minimum  distance  for  it  to  be  considered.  If 
the  track  never  moves  more  than  1  NM.  then  the  track  is  thrown  out.  After  the  movement  test  is 
satisfied,  the  track's  airspeed  and  heading  are  calculated  from  the  new  and  previous  positions.  We 
then  update  the  local  track’s  airspeed  and  heading  estimates  using  alpha  smoothing. 

First,  the  current  heading  estimate  and  its  difference  from  the  previous  estimate  t/1^ 
are  given  by 


^(j)  —  atan2  ^).  (//^  —  y ^ 

=  t 


Next,  we  determine  the  current  turn  rate  state  Sv  of  the  track: 


s['l) 


1  it  At’^  ^heading 

1  if  AvU>  >  A(/’,nm 
<  —2  it  A l. ' '' '  <  ^heading 

-1  if  A(,’(j)  <  -AV’niin 
0  otherwise 


(F-2) 


where  we  use  Avmin  —  3°  and  ^heading  is  the  standard  deviation  of  the  heading  noise,  which  is 
calculated  from  the  standard  deviations  for  range  and  azimuth  noise  of  the  sensors.  Note  that  a 
positive  A0  value  corresponds  to  a  right  turn,  while  a  negative  value  corresponds  to  a  left  turn. 
We  then  use  to  determine  the  smoothing  value  alpha  o  in  Table  F-l  of  the  individual  tracks 

that  will  be  used  to  calculate  the  heading  of  the  global  track  at  the  current  time. 

The  new  track  heading  is  finally  given  by: 

-f  a  x  Ac,(j) 


and  we  iterate  through  this  process  for  the  entire  track. 

The  process  to  estimate  airspeed  s  is  similar,  with  one  important  difference.  If  successive 
positions  are  siinplv  connected,  then  the  airspeed  estimates  will  always  he  too  high,  since  the 
aircraft  will  appear  to  “zig-zag”  along  the  track.  Thus,  only  the  projection  of  the  velocity  vector 
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TABLE  F-l 


Smoothing  values  depending  on  the  current  and  previous  turn  states. 


Previous  State 

Current  Turn  State 

Large 

Left 

Turn  (-2) 

Small 

Left 

Turn  (-1) 

No  Turn 
(0) 

Small 

Right 

Turn  (+1) 

Large 

Right 

Turn  (+2) 

Large  Left  (-2) 

0.7 

0.7 

0.4 

0.5 

0.5 

Small  Left  (-1) 

0.7 

0.4 

0.4 

0.5 

0.5 

No  Turn  (0) 

0.4 

0.4 

0.3 

0.4 

0.4 

Small  Right  (+1) 

0.5 

0.5 

0.4 

0.4 

0.7 

Large  Right  (+2) 

0.5 

0.5 

0.4 

0.7 

0.7 

onto  the  track’s  heading  vector  is  used  to  determine  the  track’s  airspeed: 


(?)  _  ( I (x^  —  ^)2  +  {y^  —  ^)2 

C°s  —  J  x  y  t'3)-t'i-  i) 

As^  = 


We  then  determine  the  current  airspeed  acceleration  state  Ss  of  the  track  using  a  similar 
technique  as  we  did  for  turn  rate. 


2 

1 

-2 

-1 
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if  As^^  ~^> 
if  A.s(j)  >  A.Smin 
il  As^  <  cr^eading 
if  A.sw  <  —  Asmi„ 
otherwise 


where  we  use  Asmin  =  18  kt  and  <rSpeed  is  the  standard  deviation  of  airspeed  error  due  to  noise 
in  range  and  azimuth  measurements  from  the  radar  sensors.  The  speed  smoothing  and  the  speed 
alpha  table  rules  are  the  same  as  for  the  heading  case. 


F.2.2  Global  Track  Smoothing 


In  order  to  determine  a  global  track’s  airspeed  and  heading  at  each  measurement,  we  use  a 
weighted  least  squares  estimation  approach.  In  this  section  we  describe  in  detail  the  approach  for 
determining  the  tracks  heading. 


First,  each  sensor’s  heading  estimate  is  assigned  a  weight  at  the  current  time  t ^  as  follows: 

^max 


wt]  =  ^heading  X 


fo>+io-n  _  t{c) 


tr 
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where  ^heading  is  the  standard  deviation  of  the  heading  noise  and  £rnax  =  18  s  is  a  discounting 
factor  that  takes  into  account  the  time  difference  between  the  measurement  from  the  sensor  being 
considered  and  the  time  for  when  we  are  determining  heading.  The  time  fj ^  corresponds  to  the 
time  of  the  next  closest  measurement  for  sensor  i  with  respect  to  the  current  time  that  we  are 
trying  to  determine  the  track's  heading. 

Next,  we  determine  the  total  turn  state  score  for  the  track 


where  N  is  the  number  of  sensors  supporting  the  track  and  Sj  is  the  current  turn  state  value  for 
the  / th  sensor  defined  in  Equation  F-2.  If  the  turn  rate  score  is  less  than  Ar.  then  we  consider  the 
track  to  be  non-turning  and  the  current  global  heading  is  simply  the  weighted  average  of  the  N 
sensor  heading  estimates: 
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Otherwise,  if  the  turn  rate  score  is  greater  than  or  equal  to  Ar,  then  we  consider  the  track 
to  be  in  a  turn.  In  this  case,  we  utilize  weighted  least  squares  estimated  to  determine  a  first-order 
relationship  between  time  and  heading. 

The'  global  track  speed  calculation  is  identical  in  form  to  the  global  track  heading  calc  ulation. 
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APPENDIX  G 
SENSORS 


The  following  table  lists  the  RADES  sensors  that  contributed  to  the  model.  Shown  are  the 
number  of  hours  of  Hight  time  by  transponder-equipped  aircraft  squawking  1200  captured  by  each 
sensor  between  1  December  2007  and  7  December  2007  and  between  1  June  2008  and  7  June  2008. 
There  was  a  total  of  200.000  raw  1200-code  hours  from  130  sensors.  Note  that  after  fusion,  these 
200.000  hours  of  reports  correspond  to  only  74.000  flight  hours. 


TABLE  G-l 

Beacon  hours. 

60  NM  range  sensors 

1131.05  m 

ACT  Waco.  TX.  USA  (ASR-11) 

284.29  1 

ADW  Camp  Springs  (Andrews  AFB).  MD.  USA  (ASR-9) 

26.23 

BGR  Bangor,  ME.  USA  (ASR-8) 

965.78  ■ 

BOS  Logan  Inti  (Boston),  MA.  USA  (ASR-9) 

575.35  ■ 

BUF  Buffalo  Inti.  NY.  USA  (ASR-9) 

1055.71  urn 

BWI  Baltimore  (BW1).  MI).  USA  (ASR-9) 

782.69  m 

COS  Colorado  Springs.  CO.  USA  (ASR-11) 

477.42  ■ 

COU  Columbia  MO.  MO,  USA  (ASR-11) 

213.16  B 

CRW  Charleston,  WV.  USA  (ASR-8) 

254.29  1 

DCA  Washington  National.  DC.  USA  (ASR-9) 

1006.08  £3 

DOV  Dover  AFB.  DE.  USA  (GPN-20) 

2116.03 

EWR  Newark,  N.I.  USA  (ASR-9) 

1785.01  mm 

ELL  Ft  Lauderdale  (Hollywood  Inti).  FL.  USA  (ASR-9) 

912.35  m 

GRK  Ft  Hood,  TX.  USA  (ASR-9) 

2494.87  mmm 

HPN  White  Plains  (Westchester  Co).  NY.  USA  (ASR-9) 

472.2  ■  HUF  Terre  Haute,  IN.  USA  (ASR-8) 


802.05  ■ 

1AD  Washington  Dulles  (Chantilly).  DC.  USA  (ASR-9) 

2599.42 

JFK  New  York  (JFK).  NY.  USA  (ASR-9) 

4318.76  wmmmm 

LAXB  Los  Angeles  Inti  -  North.  CA.  USA  (ASR-9) 

506.15  1  LFT  Lafayette.  LA.  USA  (ASR-11) 

1519.64  ■  MDT  Harrisburg.  PA.  USA  (ASR-9) 

1607.94  mm  MHT  Manchester,  NIL  USA  (ASR-9) 

310.94  l  MLU  Monroe  LA.  LA,  USA  (ASR-8) 

707.38  ■  MRB  Martinsburg.  WV.  USA  (ASR-9) 

203.04  I  MUG  Mountain  Home  AFB.  ID.  USA  (GPN-20) 
137.65  i  NQXA  Key  West  NAS.  FL.  USA  (GPN-27) 
Continued  on  next  page. 
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431.77  I 
73.94 
457.75  □ 
71.62 

2419.56  — 
529.39  E3 
374.99  B 
446.86  ■ 
1164.49  m 

3994.25  maaMwa 


810.42  a 


547.92  £3 
208.73  1 

349.35  0 
977.4  □ 
828.06  O 
554.87  □ 
347.89  D 
1602.96  dH 
4930.86  I.  .  I 

990.45  £3 
2693.26  mcf.  J 
1277.87  CH 
469.98  □ 
2250.79  r~~3 


376.86 
1512.66  EZ 
1479.68  d! 
3697.68 
3266.18 


2184.12  PawiM 
1471,06  E3 
1011.95  □ 
691.05  E3 
1999.77  r~~3 


TABLE  G-l.  Continued 

PWM  Portland  (Cumberland),  ME,  USA  (ASR-9) 

RCA  Ellsworth  AFB  (Rapid  City).  SD.  USA  (GPN-20) 
SAV  Savannah,  GA.  USA  (ASR-8) 

SAW  Marquette  (KI  Sawyer),  MI,  USA  (ASR-7F) 

SCK  Stockton,  CA,  USA  (ASR-11) 

SDF  Louisville,  KY,  USA  (ASR-9) 

SJU  San  Juan,  PR,  USA  (ASR-8) 

STT  St. Thomas,  PR,  USA  (ASR-8) 

SWF  Newburgh  (Stewart),  NY,  USA  (ASR-9) 

XMR  Cape  Canaveral,  FL,  USA  (GPN-30) _ 

120  NM  range  sensors 

NHK  Patuxent  River  NAS,  MD,  USA  (ASR-11) 

200  NM  range  sensors 

A  EX  Alexandria,  LA,  USA  (AN/FPS-20A) 

AMA  Amarillo,  TX,  USA  (AN/FPS-67B) 

ATL  Atlanta  (Marietta),  GA,  USA  (ARSR-1E) 

BAM  Battle  Mountain,  NV,  USA  (ARSR-2) 

CDC  Cedar  City,  UT.  USA  (ARSR-2) 

CPV  Coopersville,  MI,  USA  (AN/FPS-66A) 

DSV  Buffalo  (Dansville),  NY,  USA  (ARSR-1E) 

FLX  Fallon,  NV,  USA  (AN/FPS-66A) 

FPK  Salt  Lake  City  (Francis  Peak),  UT,  USA  (ARSR-1E) 
FTW  Ft  Worth  (Keller),  TX.  USA  (ARSR-1E) 

GUP  Gallup  (Farmington),  NM,  USA  (ARSR-2) 

HOU  Houston  (Ellington),  TX,  USA  (ARSR-1E) 

IND  Indianapolis,  IN,  USA  (ARSR-1E) 

IRK  Kirksville,  MO,  USA  (ARSR-3) 

JOL  Elwood  (Joliet),  IL,  USA  (ARSR-3) 

LSK  Lusk.  WY,  USA  (ARSR-2) 

MGM  Montgomery,  AL,  USA  (ARSR-1E) 

PIT  Pittsburgh  (Oakdale),  PA.  USA  (AN /FPS-67B) 

QAS  Las  Vegas  (Angel  Peak),  NV,  USA  (AN/FPS-20E) 
QBE  Roanoke  (Bedford),  VA,  USA  (ARSR-3) 

QBN  Binns  Hall,  VA,  USA  (ARSR-3) 

QBZ  Oskaloosa,  KS,  USA  (ARSR-2) 

QCF  Du  Bois  (Clearfield),  PA.  USA  (ARSR-3) 

QCK  Boise  (Cascade),  ID.  USA  (ARSR-2) 

QDB  Cleveland  (Brccksville),  OH.  USA  (ARSR-1E) 

Continued  on  next  page.  . 
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TABLE  G-l.  Continued 

QDT  Detroit  (Canton),  MI.  USA  (ARSR-1E) 

QHA  Cummington,  MA.  USA  (AN/FPS-67B) 

QHB  St.  Albans.  VT.  USA  (AN/FPS-67B) 

QHZ  Horicon,  WI.  USA  (ARSR-2) 

QJB  Gettysburg,  SD.  USA  (AN/FPS-67B) 

QJE  Minneapolis  (Apple  Valley),  MN.  USA  (ARSR-1E) 
QJO  Arlington.  1A.  USA  (ARSR-3) 

Q.IQ  San  Juan  (Pico  del  Este).  PR.  USA  (AN/FPS-20E) 
QNK  Lincolnton.  GA.  USA  (ARSR-3) 

QNM  Newport,  MS.  USA  (ARSR-3) 

QOJ  Nashville  (Joelton).  TN.  USA  (ARSR-1E) 

QPC  Haleyville.  AL.  USA  (AN/FPS-67B) 

QPK  Denver  (Parker),  CO,  USA  (ARSR-1E) 

QPL  The  Plains.  VA.  USA  (ARSR-3) 

QRI3  Citronelle,  AL.  USA  (ARSR-2) 

QRC  Benton,  PA.  USA  (AN/FPS-67B) 

QRI  Lynch.  I\Y.  USA  (ARSR-2) 

QRL  Raleigh  (Benson).  NC.  USA  (ARSR-1E) 

QRM  Charlotte  (Maiden),  NC.  USA  (ARSR-1E) 

QSA  Albuquerque  (West  Mesa),  NM.  USA  (AN/FPS-66A) 
QSI  Lovell.  WY.  USA  (ARSR-2) 

QSR  Boron.  CA.  USA  (AN/FPS-67B) 

QTZ  La  Grange.  IN.  USA  (ARSR-1E) 

QUZ  Hanna  City.  IL.  USA  (AN/FPS-67B) 

QWC  Mesa  Rica.  NM,  USA  (ARSR-1E) 

QWO  London.  OH,  USA  (ARSR-1E) 

QXR  Russellville.  AR.  USA  (AN/FPS-64A) 

QXS  Odessa  (Andrews).  TX.  USA  (ARSR-1E) 

QYB  Byhalia  (Memphis),  MS.  USA  (ARSR-1E) 

QYS  Rogers.  TX.  USA  (ARSR-1E) 

SEA  Seattle  (Ft  Lawton).  WA,  USA  (ARSR-1E) 

STL  St.  Louis,  MO,  USA  (ARSR-1E) 

SVC  Silver  City.  NM,  USA  (ARSR-2) _ 

250  NM  range  sensors 
A  JO  Ajo,  AZ.  USA  (ARSR-4) 

BAR  Barrington,  NS.  Canada  (AN/FPS-117) 

CTY  Cross  City,  FL.  USA  (ARSR-4) 

DMN  Denting  (Magdelina  Peak).  NM.  USA  (ARSR-4) 

LCH  Lake  Charles.  LA.  USA  (ARSR-4) _ 

Continued  on  next  page 


7521.59 


4168.08 


588.16  ■ 
363.15  D 
337.08  B 


61 14.05 
7649.78 


TABLE  G-l.  Continued 
MLB  Melbourne,  FL,  USA  (ARSR-4) 

NEN  Jacksonville  (White  House  FI),  FL,  USA  (ARSR-4) 
NEW  New  Orleans  (Slidell),  LA,  USA  (ARSR-4) 

NQX  Key  West,  FL,  USA  (ARSR-4) 

NSD  San  Clemente  Island,  CA,  USA  (ARSR-4) 

PAM  Panama  City  (Tyndall),  FL,  USA  (ARSR-4) 

PRB  Paso  Robles,  CA.  USA  (ARSR-4) 

QEA  North  Truro,  MA,  USA  (ARSR-4) 

QFN  Ft  Green,  FL.  USA  (ARSR-4) 

QIE  Gibbsboro,  NJ,  USA  (ARSR-4) 

QJA  Empire,  MI,  USA  (ARSR-4) 

QJD  Nashwauk,  MN,  USA  (ARSR-4) 

QKW  Makah,  WA.  USA  (ARSR-4) 

QMS  Tamiami,  FL,  USA  (ARSR-4) 

QMV  Mill  Valley,  CA,  USA  (ARSR-4) 

QNA  Morales,  TX,  USA  (ARSR-4) 

QNW  Eagle  Peak,  TX,  USA  (ARSR-4) 

QOM  King  Mountain,  TX,  USA  (ARSR-4) 

QRJ  Charleston  (Jedburg),  SC,  USA  (ARSR-4) 

QRW  Mt,  Laguna,  CA.  USA  (ARSR-4) 

QVH  Riverhead,  NY,  USA  (ARSR-4) 

QVR  Oceana,  VA,  USA  (ARSR-4) 

QWA  Watford  City,  ND.  USA  (ARSR-4) 

QXU  Utica  (Remsen),  NY,  USA  (ARSR-4) 

QYA  Bucks  Harbor,  ME.  USA  (ARSR-4) 

QYD  Caribou,  ME.  USA  (ARSR-4) 

QZA  Oilton,  TX.  USA  (ARSR-4) 

QZZ  Rainbow  Ridge,  CA,  USA  (ARSR-4) 

RSG  Rocksprings,  TX,  USA  (ARSR-4) 

SLE  Salem  (Laurel  Mountain),  OR.  USA  (ARSR-4) 
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APPENDIX  II 
DENSITY  PLOTS 


The  figures  in  this  appendix  show  the  density  of  cooperative  aircraft  squawking  1200  (VFR) 
in  the  radar  data  used  to  construct  the  model.  Similar  density  data  for  noncooperative  aircraft 
are  also  available  but  require  significantly  more  processing  due  to  clutter.  Density  is  measured  in 
average  number  of  aircraft  per  NM3.  Each  pixel  measures  1/6  of  a  degree  on  a  side  in  latitude 
and  in  longitude.  Depending  on  latitude,  the  square  mileage  of  a  pixel  may  be  between  64  NM2 
at  the  northernmost  latitude  to  92 NM2  at  the  southernmost  latitude.  The  highest  density  pixel 
is  in  Miami-Dade  Comity  and  has  a  density  of  approximately  0.003  aircraft  per  NM3  over  all 
altitude  layers  between  500  ft  AGL  and  FL180,  with  the  highest  density  layer  between  500  ft  AGL 
and  1200  ft  AGL. 

It  is  important  to  emphasize  that  this  density  map  is  the  observed  density  from  the  radar 
data  set.  Factors  such  as  terrain  and  curvature  of  the  earth  prevent  the  observation  of  some 
aircraft,  particularly  low  flying  aircraft  such  as  those  flying  VFR.  The  locations  of  the  sensors  are 
indicated  on  the  maps  by  magenta  circles.  It  is  no  coincidence  that  the  measured  density  in  the 
lowest  altitude  layer,  for  example,  tend  to  be  concentrated  around  the  locations  of  the  sensors. 
In  partic  ular,  coverage  at  low  altitudes  tends  to  be  sparser  in  the  western  portion  of  the  country. 
Coverage  above  5000  ft  AGL,  however,  is  fairly  good  across  the'  country. 

Certain  areas  on  the  maps  are  wortli  attention.  Due  to  the  protected  area  around  Washington 
DC.  there  are  virtually  no  1200  squawking  aircraft  at  any  altitude  layer  in  that  region.  By  contrast, 
there  arc*  is  a  large  concentration  in  Florida  arid  southern  Arizona,  especially  near  the  two  locations 
of  the  Embry-Riddle  flight  school  in  Daytona  Beach,  Florida  and  Prescott.  Arizona.  In  addition, 
the  effect  of  the  “upside  down  wedding-cake"  airspace  class  .structure  around  major  airports  can 
be  seen  if  one  compares  the  heavy  concentration  near  the  pixel  representing  Atlanta  and  Dallas 
at  lower  altitude  layers  against  the  empty  airspace  in  the  same  pixel  at  the  highest  altitude  layer. 
Finally,  it  can  be  seen  that  the  density  near  Denver  at  higher  altitudes  AGL  is  quite  low;  this  may 
be  attributed  to  the  fact  that  an  additional  5.000  ft  above  ground  in  that  region  puts  the  aircraft 
above  10.000  ft  MSL. 
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Figure  H-l.  Average  1200-code  cooperative  density  over  all  altitude  layers  up  to  FL180  in  aircraft  per  NAP  for  the  periods  1  7  December 
2007  and  1  7  June  2008.  White  indicates  no  VFR  traffic  reports  in  that  pixel. 
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Figure  H-3.  Average  1200-code  cooperative  density  between  1200  and  3000 ft  AGL  in  aircraft  per  NAP  for  the  periods  1  7  December  2007 
and  1  7  June  2008.  White  indicates  no  VFR  tmffic  reports  in  that  pixel. 
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Figure  H-5.  Average  1200-code  cooperative  density  between  5000 ft  AG L  and  10, 000 ft  MS L  in  aircraft  per  NAP  for  the  periods 
December  2007  and  1  7  June  2008.  White  indicates  no  VFR  traffic  reports  in  that  pixel. 
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APPENDIX  I 
BAYESIAN  NETWORKS 


This  appendix  briefly  reviews  Bayesian  networks.  Further  discussion  of  Bayesian  networks 
may  be  found  elsewhere  [23  25]. 


1.1  DEFINITION 

A  Bayesian  network  is  a  graphical  representation  of  a  multivariate  probability  distribution  over 

variables  X  =  AT . Xn .  In  particular,  a  Bayesian  network  is  a  directed  acyclic  graph  G  whose 

nodes  correspond  to  variables  and  edges  correspond  to  probabilistic*  dependencies  between  them. 
Associated  with  each  variable  AT  is  a  conditional  probability  distribution  P(xj  |  7r?).  where  7 r, 
denotes  an  instantiation  of  the  parents  of  AT  in  the  graph.  The  probability  of  an  instantiation 
of  the  variables  is  specified  directly  by  the  conditional  probability  distributions  in  the  Bayesian 
network: 

n 

P{x)  =  P(xi ,  .  .  .  ,X„)  =  Jl  P(x,  I  7Tt)  .  (1-1) 

1=1 


1.2  SAMPLING 

It  is  rather  straightforward  to  sample  from  a  multivariate  distribution  represented  by  a  Bayesian 
network.  The  first  step  is  to  produce  a  topological  sort  of  the  nodes  in  the  network.  A  topological 
sort  orders  the  nodes  in  a  Bayesian  network  such  that  if  a  node  A"*  comes  before  Xj  there  does  not 
exist  a  directed  path  from  X}  to  AT  Every  Bayesian  network  has  at  least  one  topological  sort,  but 
there  may  be  many.  Efficient  algorithms  exist  for  finding  a  valid  topological  sort  [26] . 

To  produce  a  sample  from  the  joint  distribution  represented  by  a  Bayesian  network,  we  simply 
iterate  through  a  topologically  sorted  sequence  of  the  variables  and  sample  from  their  conditional 
probability  distributions.  The  topological  sort  ensures  that  when  sampling  from  each  conditional 
probability  distribution  the  necessary  parents  have  been  instantiated. 


1.3  PARAMETER  LEARNING 

The  parameters  6  of  a  Bayesian  network  determine  the  associated  conditional  probability  distri¬ 
butions.  Given  some  fixed  network  structure  G,  we  can  learn  these  parameters  from  data.  In  this 
appendix,  we  assume  that  the  variables  are  discrete. 

Before  discussing  how  to  learn  the  parameters  of  a  Bayesian  network,  it  is  necessary  to 
introduce  some  notation.  Let  r7  represent  the  number  of  instantiations  of  AT  and  q,  represent 
the  number  of  instantiations  of  the  parents  of  AT-  If  AT  has  no  parents,  then  qt  =  \.  The  j tli 
instantiation  of  the  parents  of  AT  is  denoted 


There  are  ri9i  parameters  in  a  Bayesian  network.  Each  parameter  is  written  9jjk  and 

determines  P(Xt  =  k  \  n zj),  i.e., 

P(Xt  =  k  I  7 Tij)  =  9ljk  . 

Although  there  are  parameters,  only  —  1  )q7  are  independent. 


rule 


Computing  the  posterior  p(9  \  D.G)  involves  specifying  a  prior  p(9  \  G)  and  applying  Bayes’ 


P(e\D,G)  = 


PjDj^GMejG) 

P{D  |  G) 


P(D  |  0.G)p(0  |  G ) 
f  P(D  |  0.  G>(0  |  G)  d0  ' 


(1-2) 


If  Nijk  is  the  count  of  Xj  =  k  given  7Tjj  in  the  data  D<  then  the  probability  of  the  data  given 
the  parameters  0  is 


n  qi 


p(Di»)=nnn< 


NiJk 

ijk 


i=l  j=l  k=  1 


(1-3) 


Let  Oij  =  . . . ,  9ijVi).  Since  9tJ  is  independent  of  d^jt  when  ij  ^  ifj\  the  prior  probability 

of  the  parameters  assuming  a  fixed  structure  G  is 

p(O\G)  =  flf[p(0ij\G).  (1-4) 

2—1  j  — 1 


The  density  p(07J  \  G)  is  a  distribution  over  relative  frequencies.  Under  some  very  weak  assump¬ 
tions,  it  is  possible  to  prove  that  p(9ij  |  G )  is  Dirichlet  (see  [25] ,  Section  6.2.3).  Hence. 


P(0ij  I  G)  = 


_ r(ofijo) 

Til  i 
0 


TTl  Qa' 
ilk=l 


ijk 


if  0  <  9 ijk  <  1  and  hik  =  1 

otherwise 


where  ayi, . . . ,  Ckijn  are  the  parameters  of  the  Dirichlet  distribution  and  aZJo  =  &ijk-  For 

the  prior  to  be  objective  (or  noninformat  ive),  the  parameters  aljk  must  be  identical  for  all  k. 
Different  objective  priors  have  been  used  in  the  literature.  Cooper  and  Herskovits  [20]  use  aijk  =  1. 
Heckerman,  Geiger,  and  Chiekering  [27]  use  and  justify  aijk  =  1  / (n<?z). 

It  is  possible  to  show  that  p{9-Lj  |  D .  G)  is  Dirichlet  with  parameters  (*ijk+Nijk, ....  c %ijk+Njjk . 
Hence, 


p(9ij  |  Z>,  6?)  = 


+  TTri  1 

Ilk  1  1  U-=1  ”ijk 

0 


if  0  <  0ijk  <  1  and  ^’=1  0ljk  =  1 
otherwise 


where  TVjj  =  ^Li 

Sampling  from  a  Bayesian  network  with  G  known,  9  unknown,  and  D  observed  involves 
assigning  k  to  X7  with  probability 

P(Xi  =  k  I  7 Tij,  D.  G)  =  J  9ijkp(9ijk  I  D.  G)  d 9ijk  =  '  (I'5) 
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1.4  STRUCTURE  LEARNING 


Finding  the  most  likely  structure  G  that  generated  a  set  of  data  D.  The  objective  is  to  find  the 
most  likely  graph  given  data.  By  Bayes'  rule, 

P(G  |  D)  a  P(G)P(D  |  G)  =  P{G)  j  P{D  \  0.  G)p(6  \  G)  <10  .  (1-6) 

The  previous  section  explains  how  to  compute  the  likelihood  P(D  |  0.  G)  and  the  prior  p(0  I  G). 
Cooper  and  Herskovits  [20]  show  how  to  evaluate  the  integral  above,  resulting  in 


P(G  |  D)  =  P{G)  IIII 


C(<Vjjo) 


1?  ,  r(o,_,o  +  Arij) 


II 

k=  1 


^(aijk  +  Njjk) 

n*ijk) 


(1-7) 


where  NtJ  =  ^^.=  1  ^ijk-  Heckerman,  Geiger,  and  Cliickering  [27]  suggest  priors  over  graphs,  but 
it  is  not  uncommon  in  the  literature  to  assume  a  uniform  prior.  For  numerical  convenience,  most 
Bayesian  network  learning  packages  calculate  and  report  log  P(G  \  D)  T  A  ,  where  I\  is  a  constant 
independent  of  G.  This  quantity  is  often  called  the  Bayesian  score  and  may  be  used  for  structure 
comparison  and  search. 
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APPENDIX  J 
NETWORK  CANDIDATES 


This  section  displays  a  selection  of  various  network  structures  along  with  their  log-Bayesian 
score,  number  of  edges,  and  number  of  parameters.  Bayesian  model  selection  balances  the  com¬ 
plexity  of  the  model  according  to  the  amount  of  data  available. 

J.l  INITIAL  NETWORK  CANDIDATES 


Score  =  -961638874  (best).  Edges  =  15,  Parameters  =  31359 


89 


Score  =  -966691272.  Edges  =  14,  Parameters  =  9855 


Score  =  -966694938,  Edges  =  14,  Parameters  =  9855 
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Score  =  -967537835,  Edges  =  13.  Parameters  =  7167 


Score  =  -968020869.  Edges  =  12,  Parameters  =  7083 
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Score  =  -974539518,  Edges  =  10,  Parameters  =  1775 


Score  =  -975871704,  Edges  =  10,  Parameters  =  1791 
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A 

L 

V 

V 

h 

0 

Score  =  -1012326420.  Edges  =  0.  Parameters  =  29 


J.2  TRANSITION  NETWORK  CANDIDATES 


v(t) 


m 


\ 


A 

T'X' 


a  t  ' 


h(t  + 1 ) 


_ fc. 


+ 1 ) 


J _ t— 


i>{i+  1) 


Score  -  -29174148  (best).  Edges  =  12,  Parameters  =  18592 


Score  =  -29214567,  Edges  =  14.  Parameters  =  74368 


Score  =  -29252097.  Edges  =  12.  Parameters  =  42784 


Score  —  -29391815,  Edges  =  10.  Parameters  =  11312 


Score  =  -29393031,  Edges  =11,  Parameters  =  9296 


Score  =  -29465324,  Edges  =  10,  Parameters  =  5936 


Score  =  -29791121,  Edges  =  21,  Parameters  =  7651840 


v(t) 

h(t) 

xp(t) 

i)(t+  1) 

h(t  + 1) 

■>p(t+i) 

Score  =  -407089095,  Edges  =  0,  Parameters  =  16 
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